:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Yifei; Zhao, Hao; Li, Hongyang; Chen, Siheng
Type:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Robotics
Online Access:	https://arxiv.org/abs/2403.08770

Similar Items

CrowdMAC: Masked Crowd Density Completion for Robust Crowd Density Forecasting
by: Fujii, Ryo, et al.
Published: (2024)

Towards Long-horizon Embodied Agents with Tool-Aligned Vision-Language-Action Models
by: Lei, Zixing, et al.
Published: (2026)

Interruption-Aware Cooperative Perception for V2X Communication-Aided Autonomous Driving
by: Ren, Shunli, et al.
Published: (2023)

CoFiI2P: Coarse-to-Fine Correspondences for Image-to-Point Cloud Registration
by: Kang, Shuhao, et al.
Published: (2023)

WholeBodyVLA: Towards Unified Latent VLA for Whole-Body Loco-Manipulation Control
by: Jiang, Haoran, et al.
Published: (2025)

Tether: Autonomous Functional Play with Correspondence-Driven Trajectory Warping
by: Liang, William, et al.
Published: (2026)

MimicFunc: Imitating Tool Manipulation from a Single Human Video via Functional Correspondence
by: Tang, Chao, et al.
Published: (2025)

OpenGraph: Open-Vocabulary Hierarchical 3D Graph Representation in Large-Scale Outdoor Environments
by: Deng, Yinan, et al.
Published: (2024)

Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
by: Guo, Jun, et al.
Published: (2026)

Learning from Massive Human Videos for Universal Humanoid Pose Control
by: Mao, Jiageng, et al.
Published: (2024)

Leveraging Unknown Objects to Construct Labeled-Unlabeled Meta-Relationships for Zero-Shot Object Navigation
by: Zheng, Yanwei, et al.
Published: (2024)

Cycle-Correspondence Loss: Learning Dense View-Invariant Visual Features from Unlabeled and Unordered RGB Images
by: Adrian, David B., et al.
Published: (2024)

Into the Unknown: Towards using Generative Models for Sampling Priors of Environment Uncertainty for Planning in Configuration Spaces
by: Bhattacharjee, Subhransu S., et al.
Published: (2025)

End-to-end Autonomous Driving: Challenges and Frontiers
by: Chen, Li, et al.
Published: (2023)

Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Datasets
by: Jiang, Guangqi, et al.
Published: (2024)

Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation
by: Zhang, Pingrui, et al.
Published: (2025)

SCENES: Subpixel Correspondence Estimation With Epipolar Supervision
by: Kloepfer, Dominik A., et al.
Published: (2024)

RAG-3DSG: Enhancing 3D Scene Graphs with Re-Shot Guided Retrieval-Augmented Generation
by: Chang, Yue, et al.
Published: (2026)

CL3R: 3D Reconstruction and Contrastive Learning for Enhanced Robotic Manipulation Representations
by: Cui, Wenbo, et al.
Published: (2025)

SpatialNav: Leveraging Spatial Scene Graphs for Zero-Shot Vision-and-Language Navigation
by: Zhang, Jiwen, et al.
Published: (2026)

Driver-WM: A Driver-Centric Traffic-Conditioned Latent World Model for In-Cabin Dynamics Rollout
by: Chi, Haozhuang, et al.
Published: (2026)

ManipDreamer3D: Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory
by: Li, Ying, et al.
Published: (2025)

Visual SLAMMOT Considering Multiple Motion Models
by: Tian, Peilin, et al.
Published: (2024)

Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions
by: Dong, Yifei, et al.
Published: (2025)

Image-Goal Navigation Using Refined Feature Guidance and Scene Graph Enhancement
by: Feng, Zhicheng, et al.
Published: (2025)

Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions
by: Li, Heng, et al.
Published: (2024)

DiffVLA: Vision-Language Guided Diffusion Planning for Autonomous Driving
by: Jiang, Anqing, et al.
Published: (2025)

Bridging Spectral-wise and Multi-spectral Depth Estimation via Geometry-guided Contrastive Learning
by: Shin, Ukcheol, et al.
Published: (2025)

Fast maneuver recovery from aerial observation: trajectory clustering and outliers rejection
by: de Moura, Nelson, et al.
Published: (2024)

FreDSNet: Joint Monocular Depth and Semantic Segmentation with Fast Fourier Convolutions
by: Berenguel-Baeta, Bruno, et al.
Published: (2022)

LatentPilot: Scene-Aware Vision-and-Language Navigation by Dreaming Ahead with Latent Visual Reasoning
by: Hao, Haihong, et al.
Published: (2026)

Chain of World: World Model Thinking in Latent Motion
by: Yang, Fuxiang, et al.
Published: (2026)

World-Ego Modeling for Long-Horizon Evolution in Hybrid Embodied Tasks
by: Lin, Zuyao, et al.
Published: (2026)

Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
by: Yang, Jianing, et al.
Published: (2025)

Drive-P2D: A Progressive Perception-to-Decision Benchmark for VLMs in Autonomous Driving
by: Tang, Zecong, et al.
Published: (2026)

HA-VLN 2.0: An Open Benchmark and Leaderboard for Human-Aware Navigation in Discrete and Continuous Environments with Dynamic Multi-Human Interactions
by: Dong, Yifei, et al.
Published: (2025)

FunGraph: Functionality Aware 3D Scene Graphs for Language-Prompted Scene Interaction
by: Rotondi, Dennis, et al.
Published: (2025)

UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI
by: Zhong, Fangwei, et al.
Published: (2024)

PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos
by: Jiang, Hanxiao, et al.
Published: (2025)

EmbodiedVSR: Dynamic Scene Graph-Guided Chain-of-Thought Reasoning for Visual Spatial Tasks
by: Zhang, Yi, et al.
Published: (2025)