| Main Authors: | Song, Xurui; Huai, Shuo; Jiang, JingJing; Kong, Jiayi; Luo, Jun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.04532 |
Similar Items
Dagger Behind Smile: Fool LLMs with a Happy Ending Story
by: Song, Xurui, et al.
Published: (2025)
Xiaomi OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation
by: Lu, Jinghui, et al.
Published: (2026)
Cheating Stereo Matching in Full-scale: Physical Adversarial Attack against Binocular Depth Estimation in Autonomous Driving
by: Zhao, Kangqiao, et al.
Published: (2025)
AgentThink: A Unified Framework for Tool-Augmented Chain-of-Thought Reasoning in Vision-Language Models for Autonomous Driving
by: Qian, Kangan, et al.
Published: (2025)
Do Not DeepFake Me: Privacy-Preserving Neural 3D Head Reconstruction Without Sensitive Images
by: Kong, Jiayi, et al.
Published: (2023)
Vision-Language Interpreter for Robot Task Planning
by: Shirai, Keisuke, et al.
Published: (2023)
More Than Meets the Eye: Measuring the Semiotic Gap in Vision-Language Models via Semantic Anchorage
by: He, Wei
Published: (2026)
Affordances-Oriented Planning using Foundation Models for Continuous Vision-Language Navigation
by: Chen, Jiaqi, et al.
Published: (2024)
Discrete Diffusion for Reflective Vision-Language-Action Models in Autonomous Driving
by: Li, Pengxiang, et al.
Published: (2025)
REFLEX: Metacognitive Reasoning for Reflective Zero-Shot Robotic Planning with Large Language Models
by: Lin, Wenjie, et al.
Published: (2025)
A Superalignment Framework in Autonomous Driving with Large Language Models
by: Kong, Xiangrui, et al.
Published: (2024)
Where to Start Alignment? Diffusion Large Language Model May Demand a Distinct Position
by: Xie, Zhixin, et al.
Published: (2025)
SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation
by: Pham, Quang P. M., et al.
Published: (2025)
AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning
by: Kong, Yangzhe, et al.
Published: (2025)
ROVER: Recursive Reasoning Over Videos with Vision-Language Models for Embodied Tasks
by: Schroeder, Philip, et al.
Published: (2025)
Statler: State-Maintaining Language Models for Embodied Reasoning
by: Yoneda, Takuma, et al.
Published: (2023)
Instruct Large Language Models to Drive like Humans
by: Zhang, Ruijun, et al.
Published: (2024)
OpenFMNav: Towards Open-Set Zero-Shot Object Navigation via Vision-Language Foundation Models
by: Kuang, Yuxuan, et al.
Published: (2024)
X-Driver: Explainable Autonomous Driving with Vision-Language Models
by: Liu, Wei, et al.
Published: (2025)
ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments
by: An, Dong, et al.
Published: (2023)
Stable Language Guidance for Vision-Language-Action Models
by: Zhan, Zhihao, et al.
Published: (2026)
Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future
by: Hu, Tianshuai, et al.
Published: (2025)
DSDrive: Distilling Large Language Model for Lightweight End-to-End Autonomous Driving with Unified Reasoning and Planning
by: Liu, Wenru, et al.
Published: (2025)
Perceptive Variable-Timing Footstep Planning for Humanoid Locomotion on Disconnected Footholds
by: Xiang, Zhaoyang, et al.
Published: (2026)
More Than Meets the Eye
by: Rehak, Bob
Published: (2024)
AutoDrive-R$^2$: Incentivizing Reasoning and Self-Reflection Capacity for VLA Model in Autonomous Driving
by: Yuan, Zhenlong, et al.
Published: (2025)
Seeing before Observable: Potential Risk Reasoning in Autonomous Driving via Vision Language Models
by: Liu, Jiaxin, et al.
Published: (2025)
DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models
by: Wen, Licheng, et al.
Published: (2023)
EdgeVLA: Efficient Vision-Language-Action Models
by: Budzianowski, Paweł, et al.
Published: (2025)
Long Is More Important Than Difficult for Training Reasoning Models
by: Shen, Si, et al.
Published: (2025)
LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving
by: Sha, Hao, et al.
Published: (2023)
Driving Everywhere with Large Language Model Policy Adaptation
by: Li, Boyi, et al.
Published: (2024)
End-to-End Navigation with Vision Language Models: Transforming Spatial Reasoning into Question-Answering
by: Goetting, Dylan, et al.
Published: (2024)
FASIONAD++ : Integrating High-Level Instruction and Information Bottleneck in FAt-Slow fusION Systems for Enhanced Safety in Autonomous Driving with Adaptive Feedback
by: Qian, Kangan, et al.
Published: (2025)
SELP: Generating Safe and Efficient Task Plans for Robot Agents with Large Language Models
by: Wu, Yi, et al.
Published: (2024)
ProGAL-VLA: Grounded Alignment through Prospective Reasoning in Vision-Language-Action Models
by: Darabi, Nastaran, et al.
Published: (2026)
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities
by: Chen, Boyuan, et al.
Published: (2024)
FloorPlan-VLN: A New Paradigm for Floor Plan Guided Vision-Language Navigation
by: Chen, Kehan, et al.
Published: (2026)
PSALM-V: Automating Symbolic Planning in Interactive Visual Environments with Large Language Models
by: Zhu, Wang Bill, et al.
Published: (2025)
LLaViDA: A Large Language Vision Driving Assistant for Explicit Reasoning and Enhanced Trajectory Planning
by: Liu, Yudong, et al.
Published: (2025)