| Main Authors: | Zhang, Zhilong, Ren, Haoxiang, Sun, Yihao, Sheng, Yifei, Wang, Haonan, Lin, Haoxin, Wu, Zhichao, Bacon, Pierre-Luc, Yu, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.20607 |
Similar Items
Speedup Patch: Learning a Plug-and-Play Policy to Accelerate Embodied Manipulation
by: Wu, Zhichao, et al.
Published: (2026)
Planning with Unified Multimodal Models
by: Sun, Yihao, et al.
Published: (2025)
ACoT-VLA: Action Chain-of-Thought for Vision-Language-Action Models
by: Zhong, Linqing, et al.
Published: (2026)
Anticipation-VLA: Solving Long-Horizon Embodied Tasks via Anticipation-based Subgoal Generation
by: Zhang, Zhilong, et al.
Published: (2026)
Experiences from Benchmarking Vision-Language-Action Models for Robotic Manipulation
by: Zhang, Yihao, et al.
Published: (2025)
VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model
by: Sun, Jingwen, et al.
Published: (2026)
A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning
by: Zhai, Shaopeng, et al.
Published: (2025)
VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators
by: Li, Hengtao, et al.
Published: (2025)
Toward Embodiment Equivariant Vision-Language-Action Policy
by: Chen, Anzhe, et al.
Published: (2025)
Reinforcement Fine-Tuning of Flow-Matching Policies for Vision-Language-Action Models
by: Lyu, Mingyang, et al.
Published: (2025)
WMPO: World Model-based Policy Optimization for Vision-Language-Action Models
by: Zhu, Fangqi, et al.
Published: (2025)
OmniVLA-RL: A Vision-Language-Action Model with Spatial Understanding and Online RL
by: Jie, Haoxiang, et al.
Published: (2026)
ReinVBC: A Model-based Reinforcement Learning Approach to Vehicle Braking Controller
by: Lin, Haoxin, et al.
Published: (2026)
NS-VLA: Towards Neuro-Symbolic Vision-Language-Action Models
by: Zhu, Ziyue, et al.
Published: (2026)
RLRC: Reinforcement Learning-based Recovery for Compressed Vision-Language-Action Models
by: Chen, Yuxuan, et al.
Published: (2025)
GigaBrain-0: A World Model-Powered Vision-Language-Action Model
by: GigaBrain Team, et al.
Published: (2025)
ActionFlow: A Pipelined Action Acceleration for Vision Language Models on Edge
by: Dai, Yuntao, et al.
Published: (2025)
Towards Backdoor-Based Ownership Verification for Vision-Language-Action Models
by: Sun, Ming, et al.
Published: (2026)
World-Value-Action Model: Implicit Planning for Vision-Language-Action Systems
by: Li, Runze, et al.
Published: (2026)
AR-VLA: True Autoregressive Action Expert for Vision-Language-Action Models
by: Hu, Yutong, et al.
Published: (2026)
WorldVLN: Autoregressive World Action Model for Aerial Vision-Language Navigation
by: Zhao, Baining, et al.
Published: (2026)
RLinf-VLA: A Unified and Efficient Framework for Reinforcement Learning of Vision-Language-Action Models
by: Zang, Hongzhi, et al.
Published: (2025)
NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards
by: Hung, Chia-Yu, et al.
Published: (2025)
Neural Implicit Action Fields: From Discrete Waypoints to Continuous Functions for Vision-Language-Action Models
by: Liu, Haoyun, et al.
Published: (2026)
SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models
by: Liu, Haowen, et al.
Published: (2025)
VLAW: Iterative Co-Improvement of Vision-Language-Action Policy and World Model
by: Guo, Yanjiang, et al.
Published: (2026)
RynnVLA-002: A Unified Vision-Language-Action and World Model
by: Cen, Jun, et al.
Published: (2025)
GeoVLA: Empowering 3D Representations in Vision-Language-Action Models
by: Sun, Lin, et al.
Published: (2025)
XR-1: Towards Versatile Vision-Language-Action Models via Learning Unified Vision-Motion Representations
by: Fan, Shichao, et al.
Published: (2025)
VLM-SAFE: Vision-Language Model-Guided Safety-Aware Reinforcement Learning with World Models for Autonomous Driving
by: Qu, Yansong, et al.
Published: (2025)
ECHO: Continuous Hierarchical Memory for Vision-Language-Action Models
by: Hu, Yanbin, et al.
Published: (2026)
SwitchVLA: Execution-Aware Task Switching for Vision-Language-Action Models
by: Li, Meng, et al.
Published: (2025)
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
by: Wang, Yihao, et al.
Published: (2025)
DriveWorld-VLA: Unified Latent-Space World Modeling with Vision-Language-Action for Autonomous Driving
by: Jia, Feiyang, et al.
Published: (2026)
Stable Language Guidance for Vision-Language-Action Models
by: Zhan, Zhihao, et al.
Published: (2026)
WorldVLA: Towards Autoregressive Action World Model
by: Cen, Jun, et al.
Published: (2025)
Dual-Stream Diffusion for World-Model Augmented Vision-Language-Action Model
by: Won, John, et al.
Published: (2025)
V-VLAPS: Value-Guided Planning for Vision-Language-Action Models
by: Ren, Ke, et al.
Published: (2026)
$π_{0.5}$: a Vision-Language-Action Model with Open-World Generalization
by: Physical Intelligence, et al.
Published: (2025)
Vision-Language-Action Models for Robotics: A Review Towards Real-World Applications
by: Kawaharazuka, Kento, et al.
Published: (2025)