| Main Authors: | Zeng, Zijian, Ding, Fei, Yang, Huiming, Li, Xianwei |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.18791 |
Similar Items
DexSim2Real: Foundation Model-Guided Sim-to-Real Transfer for Generalizable Dexterous Manipulation
by: Zeng, Zijian, et al.
Published: (2026)
Rethinking the Comparison Unit in Sequence-Level Reinforcement Learning: An Equal-Length Paired Training Framework from Loss Correction to Sample Construction
by: Ding, Fei, et al.
Published: (2026)
Internalizing Outcome Supervision into Process Supervision: A New Paradigm for Reinforcement Learning for Reasoning
by: Ding, Fei, et al.
Published: (2026)
HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts
by: He, Neil, et al.
Published: (2025)
Design Conditions for Intra-Group Learning of Sequence-Level Rewards: Token Gradient Cancellation
by: Ding, Fei, et al.
Published: (2026)
villa-X: Enhancing Latent Action Modeling in Vision-Language-Action Models
by: Chen, Xiaoyu, et al.
Published: (2025)
DeepThinkVLA: Enhancing Reasoning Capability of Vision-Language-Action Models
by: Yin, Cheng, et al.
Published: (2025)
Reducing Credit Assignment Variance via Counterfactual Reasoning Paths
by: Ding, Fei, et al.
Published: (2026)
$\boldsymbol{f}$-OPD: Stabilizing Long-Horizon On-Policy Distillation with Freshness-Aware Control
by: Chen, Xianwei, et al.
Published: (2026)
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation
by: Feng, Yunhai, et al.
Published: (2025)
CRL-VLA: Continual Vision-Language-Action Learning
by: Zeng, Qixin, et al.
Published: (2026)
SOLAR-RL: Semi-Online Long-horizon Assignment Reinforcement Learning
by: Wang, Jichao, et al.
Published: (2026)
AsyncVLA: Asynchronous Flow Matching for Vision-Language-Action Models
by: Jiang, Yuhua, et al.
Published: (2025)
StateLinFormer: Stateful Training Enhancing Long-term Memory in Navigation
by: Chen, Zhiyuan, et al.
Published: (2026)
Toward Accurate Long-Horizon Robotic Manipulation: Language-to-Action with Foundation Models via Scene Graphs
by: Dinesh, Sushil Samuel, et al.
Published: (2025)
Long-horizon Visual Instruction Generation with Logic and Attribute Self-reflection
by: Suo, Yucheng, et al.
Published: (2025)
Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
by: Li, Qixiu, et al.
Published: (2025)
Continuous Reasoning for Vision-Language-Action
by: Wu, Yueh-Hua, et al.
Published: (2026)
Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections
by: Zha, Lihan, et al.
Published: (2023)
ES-dLLM: Efficient Inference for Diffusion Large Language Models by Early-Skipping
by: Zhu, Zijian, et al.
Published: (2026)
TimeCapsule: Solving the Jigsaw Puzzle of Long-Term Time Series Forecasting with Compressed Predictive Representations
by: Lu, Yihang, et al.
Published: (2025)
Enhancing Robotic Manipulation with AI Feedback from Multimodal Large Language Models
by: Liu, Jinyi, et al.
Published: (2024)
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
by: Li, Qixiu, et al.
Published: (2024)
Towards General Continuous Memory for Vision-Language Models
by: Wu, Wenyi, et al.
Published: (2025)
Recurrent Action Transformer with Memory
by: Cherepanov, Egor, et al.
Published: (2023)
Chain-of-Modality: Learning Manipulation Programs from Multimodal Human Videos with Vision-Language-Models
by: Wang, Chen, et al.
Published: (2025)
Provably Efficient Action-Manipulation Attack Against Continuous Reinforcement Learning
by: Luo, Zhi, et al.
Published: (2024)
Enhancing Generalization in Vision-Language-Action Models by Preserving Pretrained Representations
by: Grover, Shresth, et al.
Published: (2025)
HyperVLA: Efficient Inference in Vision-Language-Action Models via Hypernetworks
by: Xiong, Zheng, et al.
Published: (2025)
Understanding Asynchronous Inference Methods for Vision-Language-Action Models
by: Agouzoul, Ayoub
Published: (2026)
Jump-Start Reinforcement Learning with Vision-Language-Action Regularization
by: Moroncelli, Angelo, et al.
Published: (2026)
Branch-and-Browse: Efficient and Controllable Web Exploration with Tree-Structured Reasoning and Action Memory
by: He, Shiqi, et al.
Published: (2025)
Do What You Say: Steering Vision-Language-Action Models via Runtime Reasoning-Action Alignment Verification
by: Wu, Yilin, et al.
Published: (2025)
Spec-VLA: Speculative Decoding for Vision-Language-Action Models with Relaxed Acceptance
by: Wang, Songsheng, et al.
Published: (2025)
Autoregressive Action Sequence Learning for Robotic Manipulation
by: Zhang, Xinyu, et al.
Published: (2024)
Towards Harnessing the Collaborative Power of Large and Small Models for Domain Tasks
by: Liu, Yang, et al.
Published: (2025)
Offline Policy Learning via Skill-step Abstraction for Long-horizon Goal-Conditioned Tasks
by: Kim, Donghoon, et al.
Published: (2024)
A Survey on Efficient Vision-Language-Action Models
by: Yu, Zhaoshu, et al.
Published: (2025)
On the Reproducibility of "FairCLIP: Harnessing Fairness in Vision-Language Learning"
by: Bakker, Hua Chang, et al.
Published: (2025)
Harnessing Vision-Language Models for Time Series Anomaly Detection
by: He, Zelin, et al.
Published: (2025)