:: Library Catalog

Bibliographic Details
Main Authors:	Nishimura, Takayuki; Kuyo, Katsuyuki; Kambara, Motonari; Sugiura, Komei
Format:	Preprint
Published:	2024
Subjects:	Robotics; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2407.00985

Similar Items

Future Success Prediction in Open-Vocabulary Object Manipulation Tasks Based on End-Effector Trajectories
by: Kambara, Motonari, et al.
Published: (2024)

Task Success Prediction for Open-Vocabulary Manipulation Based on Multi-Level Aligned Representations
by: Goko, Miyu, et al.
Published: (2024)

Pre-Manipulation Alignment Prediction with Parallel Deep State-Space and Transformer Models
by: Kambara, Motonari, et al.
Published: (2025)

DM2RM: Dual-Mode Multimodal Ranking for Target Objects and Receptacles Based on Open-Vocabulary Instructions
by: Korekata, Ryosuke, et al.
Published: (2024)

Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement
by: Katsumata, Kei, et al.
Published: (2025)

Open-Vocabulary Mobile Manipulation Based on Double Relaxed Contrastive Learning with Dense Labeling
by: Yashima, Daichi, et al.
Published: (2024)

LILAC: Language-Conditioned Object-Centric Optical Flow for Open-Loop Trajectory Generation
by: Kambara, Motonari, et al.
Published: (2026)

AnoleVLA: Lightweight Vision-Language-Action Model with Deep State Space Models for Mobile Manipulation
by: Takagi, Yusuke, et al.
Published: (2026)

Affordance RAG: Hierarchical Multimodal Retrieval with Affordance-Aware Embodied Memory for Mobile Manipulation
by: Korekata, Ryosuke, et al.
Published: (2025)

GENNAV: Polygon Mask Generation for Generalized Referring Navigable Regions
by: Katsumata, Kei, et al.
Published: (2025)

Nearest Neighbor Future Captioning: Generating Descriptions for Possible Collisions in Object Placement Tasks
by: Komatsu, Takumi, et al.
Published: (2024)

LOVON: Legged Open-Vocabulary Object Navigator
by: Peng, Daojie, et al.
Published: (2025)

HomeRobot: Open-Vocabulary Mobile Manipulation
by: Yenamandra, Sriram, et al.
Published: (2023)

Leveraging Vision-Language Models for Open-Vocabulary Instance Segmentation and Tracking
by: Pätzold, Bastian, et al.
Published: (2025)

WildOS: Open-Vocabulary Object Search in the Wild
by: Shah, Hardik, et al.
Published: (2026)

Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling
by: Qiu, Xiaowen, et al.
Published: (2025)

Object-Centric Instruction Augmentation for Robotic Manipulation
by: Wen, Junjie, et al.
Published: (2024)

Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking
by: Ishaq, Ayesha, et al.
Published: (2024)

Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation
by: Dong, Runpei, et al.
Published: (2026)

Kinematify: Open-Vocabulary Synthesis of High-DoF Articulated Objects
by: Wang, Jiawei, et al.
Published: (2025)

Towards Open-World Mobile Manipulation in Homes: Lessons from the NeurIPS 2023 HomeRobot Open Vocabulary Mobile Manipulation Challenge
by: Yenamandra, Sriram, et al.
Published: (2024)

WoMAP: World Models For Embodied Open-Vocabulary Object Localization
by: Yin, Tenny, et al.
Published: (2025)

OVGrasp: Open-Vocabulary Grasping Assistance via Multimodal Intent Detection
by: Hu, Chen, et al.
Published: (2025)

Adaptive Articulated Object Manipulation On The Fly with Foundation Model Reasoning and Part Grounding
by: Zhang, Xiaojie, et al.
Published: (2025)

Splat-MOVER: Multi-Stage, Open-Vocabulary Robotic Manipulation via Editable Gaussian Splatting
by: Shorinwa, Ola, et al.
Published: (2024)

DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments
by: Ma, Ji, et al.
Published: (2024)

ObjectVLA: End-to-End Open-World Object Manipulation Without Demonstration
by: Zhu, Minjie, et al.
Published: (2025)

Zero-shot Object-Centric Instruction Following: Integrating Foundation Models with Traditional Navigation
by: Raychaudhuri, Sonia, et al.
Published: (2024)

HomeRobot Open Vocabulary Mobile Manipulation Challenge 2023 Participant Report (Team KuzHum)
by: Kuzma, Volodymyr, et al.
Published: (2024)

Attention Lattice Adapter: Visual Explanation Generation for Visual Foundation Model
by: Hirano, Shinnosuke, et al.
Published: (2025)

Spot-Compose: A Framework for Open-Vocabulary Object Retrieval and Drawer Manipulation in Point Clouds
by: Lemke, Oliver, et al.
Published: (2024)

ManipGPT: Is Affordance Segmentation by Large Vision Models Enough for Articulated Object Manipulation?
by: Kim, Taewhan, et al.
Published: (2024)

SpaCeFormer: Fast Proposal-Free Open-Vocabulary 3D Instance Segmentation
by: Choy, Chris, et al.
Published: (2026)

OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding
by: Deng, Yinan, et al.
Published: (2024)

Deep Space Weather Model: Long-Range Solar Flare Prediction from Multi-Wavelength Images
by: Nagashima, Shunya, et al.
Published: (2025)

OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation
by: Cai, Junhao, et al.
Published: (2024)

Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V
by: Zhi, Peiyuan, et al.
Published: (2024)

DivScene: Towards Open-Vocabulary Object Navigation with Large Vision Language Models in Diverse Scenes
by: Wang, Zhaowei, et al.
Published: (2024)

ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments with Vision Foundation Models
by: Zhang, Ying, et al.
Published: (2025)

Are Open-Vocabulary Models Ready for Detection of MEP Elements on Construction Sites
by: Abdalwhab, Abdalwhab, et al.
Published: (2025)