| Main Authors: | Korekata, Ryosuke, Kaneda, Kanta, Nagashima, Shunya, Imai, Yuto, Sugiura, Komei |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2408.07910 |
Similar Items
Open-Vocabulary Mobile Manipulation Based on Double Relaxed Contrastive Learning with Dense Labeling
by: Yashima, Daichi, et al.
Published: (2024)
Affordance RAG: Hierarchical Multimodal Retrieval with Affordance-Aware Embodied Memory for Mobile Manipulation
by: Korekata, Ryosuke, et al.
Published: (2025)
Object Segmentation from Open-Vocabulary Manipulation Instructions Based on Optimal Transport Polygon Matching with Multimodal Foundation Models
by: Nishimura, Takayuki, et al.
Published: (2024)
Future Success Prediction in Open-Vocabulary Object Manipulation Tasks Based on End-Effector Trajectories
by: Kambara, Motonari, et al.
Published: (2024)
Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement
by: Katsumata, Kei, et al.
Published: (2025)
Deep Space Weather Model: Long-Range Solar Flare Prediction from Multi-Wavelength Images
by: Nagashima, Shunya, et al.
Published: (2025)
Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
by: Wada, Yuiga, et al.
Published: (2024)
Task Success Prediction for Open-Vocabulary Manipulation Based on Multi-Level Aligned Representations
by: Goko, Miyu, et al.
Published: (2024)
FLARE-SSM: Deep State Space Models with Influence-Balanced Loss for 72-Hour Solar Flare Prediction
by: Takagi, Yusuke, et al.
Published: (2025)
Cortical-SSM: A Deep State Space Model for EEG and ECoG Motor Imagery Decoding
by: Suzuki, Shuntaro, et al.
Published: (2025)
NaiLIA: Multimodal Nail Design Retrieval Based on Dense Intent Descriptions and Palette Queries
by: Amemiya, Kanon, et al.
Published: (2026)
Co-Scale Cross-Attentional Transformer for Rearrangement Target Detection
by: Matsuo, Haruka, et al.
Published: (2024)
LILAC: Language-Conditioned Object-Centric Optical Flow for Open-Loop Trajectory Generation
by: Kambara, Motonari, et al.
Published: (2026)
Pre-Manipulation Alignment Prediction with Parallel Deep State-Space and Transformer Models
by: Kambara, Motonari, et al.
Published: (2025)
GENNAV: Polygon Mask Generation for Generalized Referring Navigable Regions
by: Katsumata, Kei, et al.
Published: (2025)
LOVON: Legged Open-Vocabulary Object Navigator
by: Peng, Daojie, et al.
Published: (2025)
WildOS: Open-Vocabulary Object Search in the Wild
by: Shah, Hardik, et al.
Published: (2026)
Open3DTrack: Towards Open-Vocabulary 3D Multi-Object Tracking
by: Ishaq, Ayesha, et al.
Published: (2024)
Kinematify: Open-Vocabulary Synthesis of High-DoF Articulated Objects
by: Wang, Jiawei, et al.
Published: (2025)
OVGrasp: Open-Vocabulary Grasping Assistance via Multimodal Intent Detection
by: Hu, Chen, et al.
Published: (2025)
Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling
by: Qiu, Xiaowen, et al.
Published: (2025)
DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments
by: Ma, Ji, et al.
Published: (2024)
WoMAP: World Models For Embodied Open-Vocabulary Object Localization
by: Yin, Tenny, et al.
Published: (2025)
DivScene: Towards Open-Vocabulary Object Navigation with Large Vision Language Models in Diverse Scenes
by: Wang, Zhaowei, et al.
Published: (2024)
OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding
by: Deng, Yinan, et al.
Published: (2024)
OV9D: Open-Vocabulary Category-Level 9D Object Pose and Size Estimation
by: Cai, Junhao, et al.
Published: (2024)
DualMap: Online Open-Vocabulary Semantic Mapping for Natural Language Navigation in Dynamic Changing Scenes
by: Jiang, Jiajun, et al.
Published: (2025)
FindAnything: Open-Vocabulary and Object-Centric Mapping for Robot Exploration in Any Environment
by: Laina, Sebastián Barbas, et al.
Published: (2025)
ZINA: Multimodal Fine-grained Hallucination Detection and Editing
by: Wada, Yuiga, et al.
Published: (2025)
Target-Oriented Object Grasping via Multimodal Human Guidance
by: Xie, Pengwei, et al.
Published: (2024)
Open-Vocabulary Online Semantic Mapping for SLAM
by: Martins, Tomas Berriel, et al.
Published: (2024)
OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies
by: Kong, Lingdong, et al.
Published: (2024)
OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation
by: Jiang, Haochen, et al.
Published: (2024)
HomeRobot: Open-Vocabulary Mobile Manipulation
by: Yenamandra, Sriram, et al.
Published: (2023)
NOVA: Next-step Open-Vocabulary Autoregression for 3D Multi-Object Tracking in Autonomous Driving
by: Luo, Kai, et al.
Published: (2026)
OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding
by: Wu, Yanmin, et al.
Published: (2024)
HiFlow: Tokenization-Free Scale-Wise Autoregressive Policy Learning via Flow Matching
by: Yashima, Daichi, et al.
Published: (2026)
OLiDM: Object-aware LiDAR Diffusion Models for Autonomous Driving
by: Yan, Tianyi, et al.
Published: (2024)
Are Open-Vocabulary Models Ready for Detection of MEP Elements on Construction Sites
by: Abdalwhab, Abdalwhab, et al.
Published: (2025)
Leveraging Vision-Language Models for Open-Vocabulary Instance Segmentation and Tracking
by: Pätzold, Bastian, et al.
Published: (2025)