| Main Authors: | Liu, Dayong, Xu, Chao, Chen, Weihong, Zhang, Suyu, Wang, Juncheng, Deng, Jiankang, Sun, Baigui, Liu, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2511.18685 |
Similar Items
Embodied Navigation with Auxiliary Task of Action Description Prediction
by: Kondoh, Haru, et al.
Published: (2025)
Robo-Cortex: A Self-Evolving Embodied Agent via Dual-Grain Cognitive Memory and Autonomous Knowledge Induction
by: Chan, Nga Teng, et al.
Published: (2026)
RoboView-Bias: Benchmarking Visual Bias in Embodied Agents for Robotic Manipulation
by: Liu, Enguang, et al.
Published: (2025)
Fine-Grained Action Segmentation for Renorrhaphy in Robot-Assisted Partial Nephrectomy
by: Dai, Jiaheng, et al.
Published: (2026)
The Yes-Man Syndrome: Benchmarking Abstention in Embodied Robotic Agents
by: Yeke, Doguhan, et al.
Published: (2026)
RoboTidy: A 3D Gaussian Splatting Household Tidying Benchmark for Embodied Navigation and Action
by: Sun, Xiaoquan, et al.
Published: (2025)
DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action
by: Fang, Zhen, et al.
Published: (2025)
From Imagined Futures to Executable Actions: Mixture of Latent Actions for Robot Manipulation
by: Li, Yajie, et al.
Published: (2026)
VLNVerse: A Benchmark for Vision-Language Navigation with Versatile, Embodied, Realistic Simulation and Evaluation
by: Lin, Sihao, et al.
Published: (2025)
TransFace++: Rethinking the Face Recognition Paradigm with a Focus on Accuracy, Efficiency, and Security
by: Dan, Jun, et al.
Published: (2023)
FindingDory: A Benchmark to Evaluate Memory in Embodied Agents
by: Yadav, Karmesh, et al.
Published: (2025)
NavAgent: Multi-scale Urban Street View Fusion For UAV Embodied Vision-and-Language Navigation
by: Liu, Youzhi, et al.
Published: (2024)
An Anatomy of Vision-Language-Action Models: From Modules to Milestones and Challenges
by: Xu, Chao, et al.
Published: (2025)
Embodied Image Captioning: Self-supervised Learning Agents for Spatially Coherent Image Descriptions
by: Galliena, Tommaso, et al.
Published: (2025)
Universal Actions for Enhanced Embodied Foundation Models
by: Zheng, Jinliang, et al.
Published: (2025)
PersONAL: Towards a Comprehensive Benchmark for Personalized Embodied Agents
by: Ziliotto, Filippo, et al.
Published: (2025)
Guide, Think, Act: Interactive Embodied Reasoning in Vision-Language-Action Models
by: Ling, Yiran, et al.
Published: (2026)
VAG: Dual-Stream Video-Action Generation for Embodied Data Synthesis
by: Lang, Xiaolei, et al.
Published: (2026)
Towards Long-horizon Embodied Agents with Tool-Aligned Vision-Language-Action Models
by: Lei, Zixing, et al.
Published: (2026)
Uni-LaViRA: Language-Vision-Robot Actions Translation for Unified Embodied Navigation
by: Ding, Hongyu, et al.
Published: (2026)
EnerVerse-AC: Envisioning Embodied Environments with Action Condition
by: Jiang, Yuxin, et al.
Published: (2025)
Embodied3DBench: Benchmarking Low-Level Embodied Spatial Intelligence of Vision Language Models
by: Zhang, Jiyao, et al.
Published: (2026)
TryOn-Adapter: Efficient Fine-Grained Clothing Identity Adaptation for High-Fidelity Virtual Try-On
by: Xing, Jiazheng, et al.
Published: (2024)
FineCog-Nav: Integrating Fine-grained Cognitive Modules for Zero-shot Multimodal UAV Navigation
by: Shao, Dian, et al.
Published: (2026)
Embodied Agents for Efficient Exploration and Smart Scene Description
by: Bigazzi, Roberto, et al.
Published: (2023)
UAV-ON: A Benchmark for Open-World Object Goal Navigation with Aerial Agents
by: Xiao, Jianqiang, et al.
Published: (2025)
MemoryVLA: Perceptual-Cognitive Memory in Vision-Language-Action Models for Robotic Manipulation
by: Shi, Hao, et al.
Published: (2025)
RoboAgent: Chaining Basic Capabilities for Embodied Task Planning
by: Xu, Peiran, et al.
Published: (2026)
RoboSafe: Safeguarding Embodied Agents via Executable Safety Logic
by: Wang, Le, et al.
Published: (2025)
World Action Models: The Next Frontier in Embodied AI
by: Wang, Siyin, et al.
Published: (2026)
Beyond Talking -- Generating Holistic 3D Human Dyadic Motion for Communication
by: Sun, Mingze, et al.
Published: (2024)
Beyond Binary Success: A Diagnostic Meta-Evaluation Framework for Fine-Grained Manipulation
by: Xu, He-Yang, et al.
Published: (2026)
UAV-Track VLA: Embodied Aerial Tracking via Vision-Language-Action Models
by: Zhang, Qiyao, et al.
Published: (2026)
Arcadia: Toward a Full-Lifecycle Framework for Embodied Lifelong Learning
by: Gao, Minghe, et al.
Published: (2025)
Expand Your SCOPE: Semantic Cognition over Potential-Based Exploration for Embodied Visual Navigation
by: Wang, Ningnan, et al.
Published: (2025)
EmbodiedGen: Towards a Generative 3D World Engine for Embodied Intelligence
by: Wang, Xinjie, et al.
Published: (2025)
Robobench: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models as Embodied Brain
by: Luo, Yulin, et al.
Published: (2025)
From Scan to Action: Leveraging Realistic Scans for Embodied Scene Understanding
by: Halacheva, Anna-Maria, et al.
Published: (2025)
Uni-NaVid: A Video-based Vision-Language-Action Model for Unifying Embodied Navigation Tasks
by: Zhang, Jiazhao, et al.
Published: (2024)
SKIP: Sparse Keyframe Interpolation Paradigm for Efficient Embodied World Models
by: He, Ziheng, et al.
Published: (2026)