| Main authors: | Ma, Liang; Wen, Jiajun; Lin, Min; Xu, Rongtao; Liang, Xiwen; Lin, Bingqian; Ma, Jun; Wang, Yongxin; Wei, Ziming; Lin, Haokun; Han, Mingfei; Cao, Meng; Chen, Bokui; Laptev, Ivan; Liang, Xiaodan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online access: | https://arxiv.org/abs/2506.08708 |
Similar items
Structured Preference Optimization for Vision-Language Long-Horizon Task Planning
by: Liang, Xiwen, et al.
Published: (2025)
EACO: Enhancing Alignment in Multimodal LLMs via Critical Observation
by: Wang, Yongxin, et al.
Published: (2024)
EvolveNav: Empowering LLM-Based Vision-Language Navigation via Self-Improving Embodied Reasoning
by: Lin, Bingqian, et al.
Published: (2025)
MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents
by: Wei, Ziming, et al.
Published: (2025)
CARE What Fails: Contrastive Anchored-REflection for Verifiable Multimodal Reasoning
by: Wang, Yongxin, et al.
Published: (2025)
Affordances-Oriented Planning using Foundation Models for Continuous Vision-Language Navigation
by: Chen, Jiaqi, et al.
Published: (2024)
Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation
by: Lin, Bingqian, et al.
Published: (2023)
RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation
by: Han, Mingfei, et al.
Published: (2024)
Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation
by: Wei, Ziming, et al.
Published: (2025)
ActionSink: Toward Precise Robot Manipulation with Dynamic Integration of Action Flow
by: Guo, Shanshan, et al.
Published: (2025)
CorNav: Autonomous Agent with Self-Corrected Planning for Zero-Shot Vision-and-Language Navigation
by: Liang, Xiwen, et al.
Published: (2023)
GLaD: Geometric Latent Distillation for Vision-Language-Action Models
by: Guo, Minghao, et al.
Published: (2025)
Implicit Geometry Representations for Vision-and-Language Navigation from Web Videos
by: Han, Mingfei, et al.
Published: (2026)
Seeing through Imagination: Learning Scene Geometry via Implicit Spatial World Modeling
by: Cao, Meng, et al.
Published: (2025)
Correctable Landmark Discovery via Large Models for Vision-Language Navigation
by: Lin, Bingqian, et al.
Published: (2024)
RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation
by: Zhang, Kaidong, et al.
Published: (2025)
PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation
by: Zhang, Kaidong, et al.
Published: (2024)
EchoVLA: Synergistic Declarative Memory for VLA-Driven Mobile Manipulation
by: Lin, Min, et al.
Published: (2025)
Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning
by: Lin, Bingqian, et al.
Published: (2024)
DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions
by: Wang, Guangrun, et al.
Published: (2024)
NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning
by: Lin, Bingqian, et al.
Published: (2024)
RePO-VLA: Recovery-Driven Policy Optimization for Vision-Language-Action Models
by: Liufu, Weijia, et al.
Published: (2026)
RADAR: Revealing Asymmetric Development of Abilities in MLLM Pre-training
by: Nie, Yunshuang, et al.
Published: (2026)
MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation
by: Chen, Jiaqi, et al.
Published: (2024)
Controllable Self‐Assembly Morphologies of PPV‐Based Block Copolymers
by: Liang Han, et al.
Published: (2025)
Contrastive Learning with Counterfactual Explanations for Radiology Report Generation
by: Li, Mingjie, et al.
Published: (2024)
Programmable In‐Situ Co‐Assembly of Organic Multi‐Block Nanowires for Cascade Optical Waveguides
by: Shuai Zhao, et al.
Published: (2024)
World2Act: Latent Action Post-Training from World Model Dynamics
by: Vuong, An Dinh, et al.
Published: (2026)
ProPhy: Progressive Physical Alignment for Dynamic World Simulation
by: Wang, Zijun, et al.
Published: (2025)
Mechanical Properties and In Vitro Degradation Study of Modified‐Polylactic Acid Block Copolymers/Polyetheretherketone Biocomposites
by: Meng Shi, et al.
Published: (2024)
Choose What to Observe: Task-Aware Semantic-Geometric Representations for Visuomotor Policy
by: Ding, Haoran, et al.
Published: (2026)
ManipArena: Comprehensive Real-world Evaluation of Reasoning-Oriented Generalist Robot Manipulation
by: Sun, Yu, et al.
Published: (2026)
Efficient Training of Large Vision Models via Advanced Automated Progressive Learning
by: Li, Changlin, et al.
Published: (2024)
Design of Sprint Mechanical Auxiliary Starting Block Based on the Dynamic Simulation of Energy‐Release Efficiency
by: Liang Liang
Published: (2025)
Temporal Action Detection Model Compression by Progressive Block Drop
by: Chen, Xiaoyong, et al.
Published: (2025)
Block-Weighted Lasso for Joint Optimization of Memory Depth and Kernels in Wideband DPD
by: Wang, Jinfei, et al.
Published: (2025)
SimuScene: Training and Benchmarking Code Generation to Simulate Physical Scenarios
by: Wang, Yanan, et al.
Published: (2026)
Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs
by: Lou, Siyu, et al.
Published: (2024)
InstruGen: Automatic Instruction Generation for Vision-and-Language Navigation Via Large Multimodal Models
by: Yan, Yu, et al.
Published: (2024)
Confined Monopoles in Chiral Bag
by: Lin, Fan, et al.
Published: (2025)