:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Yanbo; Fang, Zipeng; Zhao, Lei; Chen, Weidong
Format:	Preprint
Published:	2025
Subjects:	Robotics; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2507.11001

Similar Items

Scene-Agnostic Traversability Labeling and Estimation via a Multimodal Self-supervised Framework
by: Fang, Zipeng, et al.
Published: (2025)

An Efficient LiDAR-Camera Fusion Network for Multi-Class 3D Dynamic Object Detection and Trajectory Prediction
by: He, Yushen, et al.
Published: (2025)

SALT: A Flexible Semi-Automatic Labeling Tool for General LiDAR Point Clouds with Cross-Scene Adaptability and 4D Consistency
by: Wang, Yanbo, et al.
Published: (2025)

LatentPilot: Scene-Aware Vision-and-Language Navigation by Dreaming Ahead with Latent Visual Reasoning
by: Hao, Haihong, et al.
Published: (2026)

Nav-R1: Reasoning and Navigation in Embodied Scenes
by: Liu, Qingxiang, et al.
Published: (2025)

Hierarchical Semantic-Augmented Navigation: Optimal Transport and Graph-Driven Reasoning for Vision-Language Navigation
by: Fang, Xiang, et al.
Published: (2026)

Incremental Joint Learning of Depth, Pose and Implicit Scene Representation on Monocular Camera in Large-scale Scenes
by: Deng, Tianchen, et al.
Published: (2024)

AwareVLN: Reasoning with Self-awareness for Vision-Language Navigation
by: Guo, Wenxuan, et al.
Published: (2026)

TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for Zero-shot Object Navigation
by: Zhong, Linqing, et al.
Published: (2024)

MM-Nav: Multi-View VLA Model for Robust Visual Navigation via Multi-Expert Learning
by: Xu, Tianyu, et al.
Published: (2025)

SSF-PAN: Semantic Scene Flow-Based Perception for Autonomous Navigation in Traffic Scenarios
by: Chen, Yinqi, et al.
Published: (2025)

Leveraging CVAE for Joint Configuration Estimation of Multifingered Grippers from Point Cloud Data
by: Merand, Julien, et al.
Published: (2025)

NeSLAM: Neural Implicit Mapping and Self-Supervised Feature Tracking With Depth Completion and Denoising
by: Deng, Tianchen, et al.
Published: (2024)

TagaVLM: Topology-Aware Global Action Reasoning for Vision-Language Navigation
by: Liu, Jiaxing, et al.
Published: (2026)

SocialNav-MoE: A Mixture-of-Experts Vision Language Model for Socially Compliant Navigation with Reinforcement Fine-Tuning
by: Kawabata, Tomohito, et al.
Published: (2025)

ReFineVLA: Multimodal Reasoning-Aware Generalist Robotic Policies via Teacher-Guided Fine-Tuning
by: Van Vo, Tuan, et al.
Published: (2026)

Embodied Spatial Intelligence: from Implicit Scene Modeling to Spatial Reasoning
by: Fang, Jiading
Published: (2025)

Collision-Aware Object-Goal Visual Navigation via Two-Stage Deep Reinforcement Learning
by: Wang, Hongwu, et al.
Published: (2025)

SocialNav-SUB: Benchmarking VLMs for Scene Understanding in Social Robot Navigation
by: Munje, Michael J., et al.
Published: (2025)

VL-Nav: A Neuro-Symbolic Approach for Reasoning-based Vision-Language Navigation
by: Du, Yi, et al.
Published: (2025)

DivScene: Towards Open-Vocabulary Object Navigation with Large Vision Language Models in Diverse Scenes
by: Wang, Zhaowei, et al.
Published: (2024)

Stop Wandering: Efficient Vision-Language Navigation via Metacognitive Reasoning
by: Li, Xueying, et al.
Published: (2026)

PanoNav: Mapless Zero-Shot Object Navigation with Panoramic Scene Parsing and Dynamic Memory
by: Jin, Qunchao, et al.
Published: (2025)

Context-Nav: Context-Driven Exploration and Viewpoint-Aware 3D Spatial Reasoning for Instance Navigation
by: Jang, Won Shik, et al.
Published: (2026)

Overlap-Aware Feature Learning for Robust Unsupervised Domain Adaptation for 3D Semantic Segmentation
by: Chen, Junjie, et al.
Published: (2025)

MSGNav: Unleashing the Power of Multi-modal 3D Scene Graph for Zero-Shot Embodied Navigation
by: Huang, Xun, et al.
Published: (2025)

RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation
by: Han, Mingfei, et al.
Published: (2024)

NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning
by: Lin, Bingqian, et al.
Published: (2024)

DriveCode: Domain Specific Numerical Encoding for LLM-Based Autonomous Driving
by: Wang, Zhiye, et al.
Published: (2026)

Semantic Enrichment of CAD-Based Industrial Environments via Scene Graphs for Simulation and Reasoning
by: Walus, Nathan Pascal, et al.
Published: (2026)

Have We Scene It All? Scene Graph-Aware Deep Point Cloud Compression
by: Stathoulopoulos, Nikolaos, et al.
Published: (2025)

AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation
by: Xiong, Chuyan, et al.
Published: (2024)

SG-DOR: Learning Scene Graphs with Direction-Conditioned Occlusion Reasoning for Pepper Plants
by: Menon, Rohit, et al.
Published: (2026)

Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning
by: Lin, Bingqian, et al.
Published: (2024)

RANGER: A Monocular Zero-Shot Semantic Navigation Framework through Visual Contextual Adaptation
by: Yu, Ming-Ming, et al.
Published: (2025)

UniScene: Multi-Camera Unified Pre-training via 3D Scene Reconstruction for Autonomous Driving
by: Min, Chen, et al.
Published: (2023)

CrashSight: A Phase-Aware, Infrastructure-Centric Video Benchmark for Traffic Crash Scene Understanding and Reasoning
by: Gan, Rui, et al.
Published: (2026)

DualVLA: Building a Generalizable Embodied Agent via Partial Decoupling of Reasoning and Action
by: Fang, Zhen, et al.
Published: (2025)

PACA: Perspective-Aware Cross-Attention Representation for Zero-Shot Scene Rearrangement
by: Jin, Shutong, et al.
Published: (2024)

What Is The Best 3D Scene Representation for Robotics? From Geometric to Foundation Models
by: Deng, Tianchen, et al.
Published: (2025)