| Main Authors: | Byun, Ye Won, Jiao, Cathy, Noroozizadeh, Shahriar, Sun, Jimin, Vitiello, Rosa |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2406.17876 |
Similar Items
DivScene: Towards Open-Vocabulary Object Navigation with Large Vision Language Models in Diverse Scenes
by: Wang, Zhaowei, et al.
Published: (2024)
DVMNet++: Rethinking Relative Pose Estimation for Unseen Objects
by: Zhao, Chen, et al.
Published: (2024)
Unseen from Seen: Rewriting Observation-Instruction Using Foundation Models for Augmenting Vision-Language Navigation
by: Wei, Ziming, et al.
Published: (2025)
Scalable Unseen Objects 6-DoF Absolute Pose Estimation with Robotic Integration
by: Liu, Jian, et al.
Published: (2025)
Dream-SLAM: Dreaming the Unseen for Active SLAM in Dynamic Environments
by: Meng, Xiangqi, et al.
Published: (2026)
FoundPose: Unseen Object Pose Estimation with Foundation Features
by: Örnek, Evin Pınar, et al.
Published: (2023)
Adapting Segment Anything Model for Unseen Object Instance Segmentation
by: Cao, Rui, et al.
Published: (2024)
NextBestPath: Efficient 3D Mapping of Unseen Environments
by: Li, Shiyao, et al.
Published: (2025)
Navigating Beyond Instructions: Vision-and-Language Navigation in Obstructed Environments
by: Hong, Haodong, et al.
Published: (2024)
ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments
by: An, Dong, et al.
Published: (2023)
UAS Visual Navigation in Large and Unseen Environments via a Meta Agent
by: Han, Yuci, et al.
Published: (2025)
LocPoseNet: Robust Location Prior for Unseen Object Pose Estimation
by: Zhao, Chen, et al.
Published: (2022)
Enhancing Underwater Object Detection through Spatio-Temporal Analysis and Spatial Attention Networks
by: Karri, Sai Likhith, et al.
Published: (2025)
DM2RM: Dual-Mode Multimodal Ranking for Target Objects and Receptacles Based on Open-Vocabulary Instructions
by: Korekata, Ryosuke, et al.
Published: (2024)
ArtReg: Visuo-Tactile based Pose Tracking and Manipulation of Unseen Articulated Objects
by: Murali, Prajval Kumar, et al.
Published: (2025)
BOP Challenge 2023 on Detection, Segmentation and Pose Estimation of Seen and Unseen Rigid Objects
by: Hodan, Tomas, et al.
Published: (2024)
Holodeck: Language Guided Generation of 3D Embodied AI Environments
by: Yang, Yue, et al.
Published: (2023)
Segmenting Known Objects and Unseen Unknowns without Prior Knowledge
by: Gasperini, Stefano, et al.
Published: (2022)
FLAME: Learning to Navigate with Multimodal LLM in Urban Environments
by: Xu, Yunzhe, et al.
Published: (2024)
Ground-level Viewpoint Vision-and-Language Navigation in Continuous Environments
by: Li, Zerui, et al.
Published: (2025)
AgentThink: A Unified Framework for Tool-Augmented Chain-of-Thought Reasoning in Vision-Language Models for Autonomous Driving
by: Qian, Kangan, et al.
Published: (2025)
CAMON: Cooperative Agents for Multi-Object Navigation with LLM-based Conversations
by: Wu, Pengying, et al.
Published: (2024)
CLIP-Loc: Multi-modal Landmark Association for Global Localization in Object-based Maps
by: Matsuzaki, Shigemichi, et al.
Published: (2024)
GFreeDet: Exploiting Gaussian Splatting and Foundation Models for Model-free Unseen Object Detection in the BOP Challenge 2024
by: Liu, Xingyu, et al.
Published: (2024)
InstructNav: Zero-shot System for Generic Instruction Navigation in Unexplored Environment
by: Long, Yuxing, et al.
Published: (2024)
Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding
by: Mitra, Chancharik, et al.
Published: (2023)
VLN-NF: Feasibility-Aware Vision-and-Language Navigation with False-Premise Instructions
by: Su, Hung-Ting, et al.
Published: (2026)
Language-Based Augmentation to Address Shortcut Learning in Object Goal Navigation
by: Hoftijzer, Dennis, et al.
Published: (2024)
The Better You Learn, The Smarter You Prune: Towards Efficient Vision-language-action Models via Differentiable Token Pruning
by: Jiang, Titong, et al.
Published: (2025)
Open-vocabulary Mobile Manipulation in Unseen Dynamic Environments with 3D Semantic Maps
by: Qiu, Dicong, et al.
Published: (2024)
UncertaintyTrack: Exploiting Detection and Localization Uncertainty in Multi-Object Tracking
by: Lee, Chang Won, et al.
Published: (2024)
MiMo-Embodied: X-Embodied Foundation Model Technical Report
by: Hao, Xiaoshuai, et al.
Published: (2025)
Robotic-CLIP: Fine-tuning CLIP on Action Data for Robotic Applications
by: Nguyen, Nghia, et al.
Published: (2024)
Xiaomi OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation
by: Lu, Jinghui, et al.
Published: (2026)
Virtual Community: An Open World for Humans, Robots, and Society
by: Zhou, Qinhong, et al.
Published: (2025)
CLIP-Clique: Graph-based Correspondence Matching Augmented by Vision Language Models for Object-based Global Localization
by: Matsuzaki, Shigemichi, et al.
Published: (2024)
Online Mapping for Autonomous Driving: Addressing Sensor Generalization and Dynamic Map Updates in Campus Environments
by: Zhang, Zihan, et al.
Published: (2025)
Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs
by: Hong, Yining, et al.
Published: (2026)
A Language Agent for Autonomous Driving
by: Mao, Jiageng, et al.
Published: (2023)
An Efficient Method for Accurate Pose Estimation and Error Correction of Cuboidal Objects
by: Rai, Utsav, et al.
Published: (2025)