Library Catalog

Bibliographic Details
Main Authors:	Truongcao, Keith; Nhu, Christopher; An, Zijian; Nguyen, Phong; Cai, Siwei; Zhou, Lifeng
Format:	Preprint
Published:	2026
Subjects:	Robotics
Online Access:	https://arxiv.org/abs/2606.00966

Similar Items

SeqVLA: Sequential Task Execution for Long-Horizon Manipulation with Completion-Aware Vision-Language-Action Model
by: Yang, Ran, et al.
Published: (2025)

LLM-Land: Large Language Models for Context-Aware Drone Landing
by: Cai, Siwei, et al.
Published: (2025)

CLAW: A Vision-Language-Action Framework for Weight-Aware Robotic Grasping
by: An, Zijian, et al.
Published: (2025)

Double Oracle Algorithm for Game-Theoretic Robot Allocation on Graphs
by: An, Zijian, et al.
Published: (2023)

Spatial Memory for Out-of-Vision Manipulation in Vision-Language-Action
by: Li, Pengteng, et al.
Published: (2026)

Large Language Models for Multi-Robot Systems: A Survey
by: Li, Peihan, et al.
Published: (2025)

Survey of Vision-Language-Action Models for Embodied Manipulation
by: Li, Haoran, et al.
Published: (2025)

BLURR: A Boosted Low-Resource Inference for Vision-Language-Action Models
by: Ma, Xiaoyu, et al.
Published: (2025)

Adaptive Action Chunking at Inference-time for Vision-Language-Action Models
by: Liang, Yuanchang, et al.
Published: (2026)

VLATest: Testing and Evaluating Vision-Language-Action Models for Robotic Manipulation
by: Wang, Zhijie, et al.
Published: (2024)

Robotic Manipulation is Vision-to-Geometry Mapping ($f(v) \rightarrow G$): Vision-Geometry Backbones over Language and Video Models
by: Song, Zijian, et al.
Published: (2026)

HapticVLA: Contact-Rich Manipulation via Vision-Language-Action Model without Inference-Time Tactile Sensing
by: Gubernatorov, Konstantin, et al.
Published: (2026)

VLA-REPLICA: A Low-Cost, Reproducible Benchmark for Real-World Evaluation of Vision-Language-Action Models
by: Huang, Alex S., et al.
Published: (2026)

DySL-VLA: Efficient Vision-Language-Action Model Inference via Dynamic-Static Layer-Skipping for Robot Manipulation
by: Yang, Zebin, et al.
Published: (2026)

VTLA: Vision-Tactile-Language-Action Model with Preference Learning for Insertion Manipulation
by: Zhang, Chaofan, et al.
Published: (2025)

VLAS: Vision-Language-Action Model With Speech Instructions For Customized Robot Manipulation
by: Zhao, Wei, et al.
Published: (2025)

DAM-VLA: A Dynamic Action Model-Based Vision-Language-Action Framework for Robot Manipulation
by: Peng, Xiongfeng, et al.
Published: (2026)

LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation
by: Wang, Zhijie, et al.
Published: (2024)

AIR-VLA: Vision-Language-Action Systems for Aerial Manipulation
by: Sun, Jianli, et al.
Published: (2026)

Action-aware Dynamic Pruning for Efficient Vision-Language-Action Manipulation
by: Pei, Xiaohuan, et al.
Published: (2025)

Experiences from Benchmarking Vision-Language-Action Models for Robotic Manipulation
by: Zhang, Yihao, et al.
Published: (2025)

SG-VLA: Learning Spatially-Grounded Vision-Language-Action Models for Mobile Manipulation
by: Tu, Ruisen, et al.
Published: (2026)

Gaze2Act: Gaze-Conditioned Vision-Language-Action Policies for Interactive Robot Manipulation
by: Zuo, Kuangji, et al.
Published: (2026)

Safe-Night VLA: Seeing the Unseen via Thermal-Perceptive Vision-Language-Action Models for Safety-Critical Manipulation
by: Yu, Dian, et al.
Published: (2026)

TacVLA: Contact-Aware Tactile Fusion for Robust Vision-Language-Action Manipulation
by: Zhang, Kaidi, et al.
Published: (2026)

MergeVLA: Cross-Skill Model Merging Toward a Generalist Vision-Language-Action Agent
by: Fu, Yuxia, et al.
Published: (2025)

Efficient Vision-Language-Action Models for Embodied Manipulation: A Systematic Survey
by: Guan, Weifan, et al.
Published: (2025)

ReasonDrive: Efficient Visual Question Answering for Autonomous Vehicles with Reasoning-Enhanced Small Vision-Language Models
by: Chahe, Amirhosein, et al.
Published: (2025)

Vision Language Action Models in Robotic Manipulation: A Systematic Review
by: Din, Muhayy Ud, et al.
Published: (2025)

MemoryVLA: Perceptual-Cognitive Memory in Vision-Language-Action Models for Robotic Manipulation
by: Shi, Hao, et al.
Published: (2025)

Bi-VLA: Vision-Language-Action Model-Based System for Bimanual Robotic Dexterous Manipulations
by: Gbagbe, Koffivi Fidèle, et al.
Published: (2024)

Long-VLA: Unleashing Long-Horizon Capability of Vision Language Action Model for Robot Manipulation
by: Fan, Yiguo, et al.
Published: (2025)

VLA^2: Empowering Vision-Language-Action Models with an Agentic Framework for Unseen Concept Manipulation
by: Zhao, Han, et al.
Published: (2025)

CrayonRobo: Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation
by: Li, Xiaoqi, et al.
Published: (2025)

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
by: Zhong, Yifan, et al.
Published: (2025)

Understanding Asynchronous Inference Methods for Vision-Language-Action Models
by: Agouzoul, Ayoub
Published: (2026)

Failing Forward: Adaptive Failure-Informed Learning for Vision-Language-Action Models
by: Zheng, Meng, et al.
Published: (2026)

Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization
by: Zhou, Jiaming, et al.
Published: (2025)

Concept-Based Dictionary Learning for Inference-Time Safety in Vision Language Action Models
by: Wen, Siqi, et al.
Published: (2026)

A Low-Cost Vision-Based Tactile Gripper with Pretraining Learning for Contact-Rich Manipulation
by: Liu, Yaohua, et al.
Published: (2026)