:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Zhilong; Ren, Haoxiang; Sun, Yihao; Sheng, Yifei; Wang, Haonan; Lin, Haoxin; Wu, Zhichao; Bacon, Pierre-Luc; Yu, Yang
Format:	Preprint
Published:	2026
Subjects:	Robotics; Machine Learning
Online Access:	https://arxiv.org/abs/2603.20607

Similar Items

Speedup Patch: Learning a Plug-and-Play Policy to Accelerate Embodied Manipulation
by: Wu, Zhichao, et al.
Published: (2026)

Planning with Unified Multimodal Models
by: Sun, Yihao, et al.
Published: (2025)

ACoT-VLA: Action Chain-of-Thought for Vision-Language-Action Models
by: Zhong, Linqing, et al.
Published: (2026)

Anticipation-VLA: Solving Long-Horizon Embodied Tasks via Anticipation-based Subgoal Generation
by: Zhang, Zhilong, et al.
Published: (2026)

Experiences from Benchmarking Vision-Language-Action Models for Robotic Manipulation
by: Zhang, Yihao, et al.
Published: (2025)

VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model
by: Sun, Jingwen, et al.
Published: (2026)

A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning
by: Zhai, Shaopeng, et al.
Published: (2025)

VLA-RFT: Vision-Language-Action Reinforcement Fine-tuning with Verified Rewards in World Simulators
by: Li, Hengtao, et al.
Published: (2025)

Toward Embodiment Equivariant Vision-Language-Action Policy
by: Chen, Anzhe, et al.
Published: (2025)

Reinforcement Fine-Tuning of Flow-Matching Policies for Vision-Language-Action Models
by: Lyu, Mingyang, et al.
Published: (2025)

WMPO: World Model-based Policy Optimization for Vision-Language-Action Models
by: Zhu, Fangqi, et al.
Published: (2025)

OmniVLA-RL: A Vision-Language-Action Model with Spatial Understanding and Online RL
by: Jie, Haoxiang, et al.
Published: (2026)

ReinVBC: A Model-based Reinforcement Learning Approach to Vehicle Braking Controller
by: Lin, Haoxin, et al.
Published: (2026)

NS-VLA: Towards Neuro-Symbolic Vision-Language-Action Models
by: Zhu, Ziyue, et al.
Published: (2026)

RLRC: Reinforcement Learning-based Recovery for Compressed Vision-Language-Action Models
by: Chen, Yuxuan, et al.
Published: (2025)

GigaBrain-0: A World Model-Powered Vision-Language-Action Model
by: GigaBrain Team, et al.
Published: (2025)

ActionFlow: A Pipelined Action Acceleration for Vision Language Models on Edge
by: Dai, Yuntao, et al.
Published: (2025)

Towards Backdoor-Based Ownership Verification for Vision-Language-Action Models
by: Sun, Ming, et al.
Published: (2026)

World-Value-Action Model: Implicit Planning for Vision-Language-Action Systems
by: Li, Runze, et al.
Published: (2026)

AR-VLA: True Autoregressive Action Expert for Vision-Language-Action Models
by: Hu, Yutong, et al.
Published: (2026)

WorldVLN: Autoregressive World Action Model for Aerial Vision-Language Navigation
by: Zhao, Baining, et al.
Published: (2026)

RLinf-VLA: A Unified and Efficient Framework for Reinforcement Learning of Vision-Language-Action Models
by: Zang, Hongzhi, et al.
Published: (2025)

NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards
by: Hung, Chia-Yu, et al.
Published: (2025)

Neural Implicit Action Fields: From Discrete Waypoints to Continuous Functions for Vision-Language-Action Models
by: Liu, Haoyun, et al.
Published: (2026)

SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models
by: Liu, Haowen, et al.
Published: (2025)

VLAW: Iterative Co-Improvement of Vision-Language-Action Policy and World Model
by: Guo, Yanjiang, et al.
Published: (2026)

RynnVLA-002: A Unified Vision-Language-Action and World Model
by: Cen, Jun, et al.
Published: (2025)

GeoVLA: Empowering 3D Representations in Vision-Language-Action Models
by: Sun, Lin, et al.
Published: (2025)

XR-1: Towards Versatile Vision-Language-Action Models via Learning Unified Vision-Motion Representations
by: Fan, Shichao, et al.
Published: (2025)

VLM-SAFE: Vision-Language Model-Guided Safety-Aware Reinforcement Learning with World Models for Autonomous Driving
by: Qu, Yansong, et al.
Published: (2025)

ECHO: Continuous Hierarchical Memory for Vision-Language-Action Models
by: Hu, Yanbin, et al.
Published: (2026)

SwitchVLA: Execution-Aware Task Switching for Vision-Language-Action Models
by: Li, Meng, et al.
Published: (2025)

VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
by: Wang, Yihao, et al.
Published: (2025)

DriveWorld-VLA: Unified Latent-Space World Modeling with Vision-Language-Action for Autonomous Driving
by: Jia, Feiyang, et al.
Published: (2026)

Stable Language Guidance for Vision-Language-Action Models
by: Zhan, Zhihao, et al.
Published: (2026)

WorldVLA: Towards Autoregressive Action World Model
by: Cen, Jun, et al.
Published: (2025)

Dual-Stream Diffusion for World-Model Augmented Vision-Language-Action Model
by: Won, John, et al.
Published: (2025)

V-VLAPS: Value-Guided Planning for Vision-Language-Action Models
by: Ren, Ke, et al.
Published: (2026)

$π_{0.5}$: a Vision-Language-Action Model with Open-World Generalization
by: Physical Intelligence, et al.
Published: (2025)

Vision-Language-Action Models for Robotics: A Review Towards Real-World Applications
by: Kawaharazuka, Kento, et al.
Published: (2025)