| Main Authors: | Xu, Feng; Zhai, Guangyao; Kong, Xin; Fu, Tingzhong; Gordon, Daniel F. N.; An, Xueli; Busam, Benjamin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.05107 |
Similar Items
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots
by: Apanasevich, I., et al.
Published: (2026)
SG-Bot: Object Rearrangement via Coarse-to-Fine Robotic Imagination on Scene Graphs
by: Zhai, Guangyao, et al.
Published: (2023)
RobustVLA: Robustness-Aware Reinforcement Post-Training for Vision-Language-Action Models
by: Zhang, Hongyin, et al.
Published: (2025)
SA-VLA: Spatially-Aware Flow-Matching for Vision-Language-Action Reinforcement Learning
by: Pan, Xu, et al.
Published: (2026)
GeoAware-VLA: Implicit Geometry Aware Vision-Language-Action Model
by: Abouzeid, Ali, et al.
Published: (2025)
VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators
by: Li, Hengtao, et al.
Published: (2025)
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
by: Yang, Shuai, et al.
Published: (2025)
FineVLA: Fine-Grained Instruction Alignment for Steerable Vision-Language-Action Policies
by: Hu, Xintong, et al.
Published: (2026)
Reinforcement Fine-Tuning of Flow-Matching Policies for Vision-Language-Action Models
by: Lyu, Mingyang, et al.
Published: (2025)
EdgeVLA: Efficient Vision-Language-Action Models
by: Budzianowski, Paweł, et al.
Published: (2025)
PAPO-VLA: Planning-Aware Policy Optimization for Vision-Language-Action Models
by: Guo, Peizheng, et al.
Published: (2026)
SwitchVLA: Execution-Aware Task Switching for Vision-Language-Action Models
by: Li, Meng, et al.
Published: (2025)
GesVLA: Gesture-Aware Vision-Language-Action Model Embedded Representations
by: Guo, Wenxuan, et al.
Published: (2026)
ReFineVLA: Reasoning-Aware Teacher-Guided Transfer Fine-Tuning
by: Van Vo, Tuan, et al.
Published: (2025)
ST4VLA: Spatially Guided Training for Vision-Language-Action Models
by: Ye, Jinhui, et al.
Published: (2026)
OccVLA: Vision-Language-Action Model with Implicit 3D Occupancy Supervision
by: Liu, Ruixun, et al.
Published: (2025)
StyleVLA: Driving Style-Aware Vision Language Action Model for Autonomous Driving
by: Gao, Yuan, et al.
Published: (2026)
TacVLA: Contact-Aware Tactile Fusion for Robust Vision-Language-Action Manipulation
by: Zhang, Kaidi, et al.
Published: (2026)
AffordVLA: Injecting Affordance Representations into Vision-Language-Action Models via Implicit Feature Alignment
by: Kong, Weijie, et al.
Published: (2026)
ACoT-VLA: Action Chain-of-Thought for Vision-Language-Action Models
by: Zhong, Linqing, et al.
Published: (2026)
SeqVLA: Sequential Task Execution for Long-Horizon Manipulation with Completion-Aware Vision-Language-Action Model
by: Yang, Ran, et al.
Published: (2025)
StereoVLA: Enhancing Vision-Language-Action Models with Stereo Vision
by: Deng, Shengliang, et al.
Published: (2025)
RLinf-VLA: A Unified and Efficient Framework for Reinforcement Learning of Vision-Language-Action Models
by: Zang, Hongzhi, et al.
Published: (2025)
KineVLA: Towards Kinematics-Aware Vision-Language-Action Models with Bi-Level Action Decomposition
by: Han, Gaoge, et al.
Published: (2026)
OpenVLA: An Open-Source Vision-Language-Action Model
by: Kim, Moo Jin, et al.
Published: (2024)
MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots
by: Huang, Ting, et al.
Published: (2025)
CoFreeVLA: Collision-Free Dual-Arm Manipulation via Vision-Language-Action Model and Risk Estimation
by: Zhai, Xuanran, et al.
Published: (2026)
AIR-VLA: Vision-Language-Action Systems for Aerial Manipulation
by: Sun, Jianli, et al.
Published: (2026)
RynnVLA-002: A Unified Vision-Language-Action and World Model
by: Cen, Jun, et al.
Published: (2025)
AC^2-VLA: Action-Context-Aware Adaptive Computation in Vision-Language-Action Models for Efficient Robotic Manipulation
by: Yu, Wenda, et al.
Published: (2026)
ReconVLA: An Uncertainty-Guided and Failure-Aware Vision-Language-Action Framework for Robotic Control
by: Chen, Lingling, et al.
Published: (2026)
DyQ-VLA: Temporal-Dynamic-Aware Quantization for Embodied Vision-Language-Action Models
by: Zheng, Zihao, et al.
Published: (2026)
ProgressVLA: Progress-Guided Diffusion Policy for Vision-Language Robotic Manipulation
by: Yan, Hongyu, et al.
Published: (2026)
MergeVLA: Cross-Skill Model Merging Toward a Generalist Vision-Language-Action Agent
by: Fu, Yuxia, et al.
Published: (2025)
PriorVLA: Prior-Preserving Adaptation for Vision-Language-Action Models
by: Guo, Xinyu, et al.
Published: (2026)
FocusVLA: Focused Visual Utilization for Vision-Language-Action Models
by: Zhang, Yichi, et al.
Published: (2026)
VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models
by: Wang, Zixuan, et al.
Published: (2026)
RedVLA: Physical Red Teaming for Vision-Language-Action Models
by: Zhang, Yuhao, et al.
Published: (2026)
NS-VLA: Towards Neuro-Symbolic Vision-Language-Action Models
by: Zhu, Ziyue, et al.
Published: (2026)
FutureVLA: Joint Visuomotor Prediction for Vision-Language-Action Model
by: Xu, Xiaoxu, et al.
Published: (2026)