| Main Authors: | Truongcao, Keith; Nhu, Christopher; An, Zijian; Nguyen, Phong; Cai, Siwei; Zhou, Lifeng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2606.00966 |
Similar Items
SeqVLA: Sequential Task Execution for Long-Horizon Manipulation with Completion-Aware Vision-Language-Action Model
by: Yang, Ran, et al.
Published: (2025)
LLM-Land: Large Language Models for Context-Aware Drone Landing
by: Cai, Siwei, et al.
Published: (2025)
CLAW: A Vision-Language-Action Framework for Weight-Aware Robotic Grasping
by: An, Zijian, et al.
Published: (2025)
Double Oracle Algorithm for Game-Theoretic Robot Allocation on Graphs
by: An, Zijian, et al.
Published: (2023)
Spatial Memory for Out-of-Vision Manipulation in Vision-Language-Action
by: Li, Pengteng, et al.
Published: (2026)
Large Language Models for Multi-Robot Systems: A Survey
by: Li, Peihan, et al.
Published: (2025)
Survey of Vision-Language-Action Models for Embodied Manipulation
by: Li, Haoran, et al.
Published: (2025)
BLURR: A Boosted Low-Resource Inference for Vision-Language-Action Models
by: Ma, Xiaoyu, et al.
Published: (2025)
Adaptive Action Chunking at Inference-time for Vision-Language-Action Models
by: Liang, Yuanchang, et al.
Published: (2026)
VLATest: Testing and Evaluating Vision-Language-Action Models for Robotic Manipulation
by: Wang, Zhijie, et al.
Published: (2024)
Robotic Manipulation is Vision-to-Geometry Mapping ($f(v) \rightarrow G$): Vision-Geometry Backbones over Language and Video Models
by: Song, Zijian, et al.
Published: (2026)
HapticVLA: Contact-Rich Manipulation via Vision-Language-Action Model without Inference-Time Tactile Sensing
by: Gubernatorov, Konstantin, et al.
Published: (2026)
VLA-REPLICA: A Low-Cost, Reproducible Benchmark for Real-World Evaluation of Vision-Language-Action Models
by: Huang, Alex S., et al.
Published: (2026)
DySL-VLA: Efficient Vision-Language-Action Model Inference via Dynamic-Static Layer-Skipping for Robot Manipulation
by: Yang, Zebin, et al.
Published: (2026)
VTLA: Vision-Tactile-Language-Action Model with Preference Learning for Insertion Manipulation
by: Zhang, Chaofan, et al.
Published: (2025)
VLAS: Vision-Language-Action Model With Speech Instructions For Customized Robot Manipulation
by: Zhao, Wei, et al.
Published: (2025)
DAM-VLA: A Dynamic Action Model-Based Vision-Language-Action Framework for Robot Manipulation
by: Peng, Xiongfeng, et al.
Published: (2026)
LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation
by: Wang, Zhijie, et al.
Published: (2024)
AIR-VLA: Vision-Language-Action Systems for Aerial Manipulation
by: Sun, Jianli, et al.
Published: (2026)
Action-aware Dynamic Pruning for Efficient Vision-Language-Action Manipulation
by: Pei, Xiaohuan, et al.
Published: (2025)
Experiences from Benchmarking Vision-Language-Action Models for Robotic Manipulation
by: Zhang, Yihao, et al.
Published: (2025)
SG-VLA: Learning Spatially-Grounded Vision-Language-Action Models for Mobile Manipulation
by: Tu, Ruisen, et al.
Published: (2026)
Gaze2Act: Gaze-Conditioned Vision-Language-Action Policies for Interactive Robot Manipulation
by: Zuo, Kuangji, et al.
Published: (2026)
Safe-Night VLA: Seeing the Unseen via Thermal-Perceptive Vision-Language-Action Models for Safety-Critical Manipulation
by: Yu, Dian, et al.
Published: (2026)
TacVLA: Contact-Aware Tactile Fusion for Robust Vision-Language-Action Manipulation
by: Zhang, Kaidi, et al.
Published: (2026)
MergeVLA: Cross-Skill Model Merging Toward a Generalist Vision-Language-Action Agent
by: Fu, Yuxia, et al.
Published: (2025)
Efficient Vision-Language-Action Models for Embodied Manipulation: A Systematic Survey
by: Guan, Weifan, et al.
Published: (2025)
ReasonDrive: Efficient Visual Question Answering for Autonomous Vehicles with Reasoning-Enhanced Small Vision-Language Models
by: Chahe, Amirhosein, et al.
Published: (2025)
Vision Language Action Models in Robotic Manipulation: A Systematic Review
by: Din, Muhayy Ud, et al.
Published: (2025)
MemoryVLA: Perceptual-Cognitive Memory in Vision-Language-Action Models for Robotic Manipulation
by: Shi, Hao, et al.
Published: (2025)
Bi-VLA: Vision-Language-Action Model-Based System for Bimanual Robotic Dexterous Manipulations
by: Gbagbe, Koffivi Fidèle, et al.
Published: (2024)
Long-VLA: Unleashing Long-Horizon Capability of Vision Language Action Model for Robot Manipulation
by: Fan, Yiguo, et al.
Published: (2025)
VLA^2: Empowering Vision-Language-Action Models with an Agentic Framework for Unseen Concept Manipulation
by: Zhao, Han, et al.
Published: (2025)
CrayonRobo: Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation
by: Li, Xiaoqi, et al.
Published: (2025)
A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
by: Zhong, Yifan, et al.
Published: (2025)
Understanding Asynchronous Inference Methods for Vision-Language-Action Models
by: Agouzoul, Ayoub
Published: (2026)
Failing Forward: Adaptive Failure-Informed Learning for Vision-Language-Action Models
by: Zheng, Meng, et al.
Published: (2026)
Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization
by: Zhou, Jiaming, et al.
Published: (2025)
Concept-Based Dictionary Learning for Inference-Time Safety in Vision Language Action Models
by: Wen, Siqi, et al.
Published: (2026)
A Low-Cost Vision-Based Tactile Gripper with Pretraining Learning for Contact-Rich Manipulation
by: Liu, Yaohua, et al.
Published: (2026)