Staff View: VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception :: Library Catalog

Bibliographic Details
Main Authors:	Wan, Zhaoliang; Ling, Yonggen; Yi, Senlin; Qi, Lu; Lee, Wangwei; Lu, Minglei; Yang, Sicheng; Teng, Xiao; Lu, Peng; Yang, Xu; Yang, Ming-Hsuan; Cheng, Hui
Format:	Preprint
Published:	2024
Subjects:	Robotics
Online Access:	https://arxiv.org/abs/2501.00510

_version_	1866915091880869888
author	Wan, Zhaoliang Ling, Yonggen Yi, Senlin Qi, Lu Lee, Wangwei Lu, Minglei Yang, Sicheng Teng, Xiao Lu, Peng Yang, Xu Yang, Ming-Hsuan Cheng, Hui
author_facet	Wan, Zhaoliang Ling, Yonggen Yi, Senlin Qi, Lu Lee, Wangwei Lu, Minglei Yang, Sicheng Teng, Xiao Lu, Peng Yang, Xu Yang, Ming-Hsuan Cheng, Hui
contents	This paper addresses the scarcity of large-scale datasets for accurate object-in-hand pose estimation, which is crucial for robotic in-hand manipulation within the "Perception-Planning-Control" paradigm. Specifically, we introduce VinT-6D, the first extensive multi-modal dataset integrating vision, touch, and proprioception, to enhance robotic manipulation. VinT-6D comprises 2 million VinT-Sim and 0.1 million VinT-Real splits, collected via simulations in MuJoCo and Blender and a custom-designed real-world platform. This dataset is tailored for robotic hands, offering models with whole-hand tactile perception and high-quality, well-aligned data. To the best of our knowledge, the VinT-Real is the largest considering the collection difficulties in the real-world environment so that it can bridge the gap of simulation to real compared to the previous works. Built upon VinT-6D, we present a benchmark method that shows significant improvements in performance by fusing multi-modal information. The project is available at https://VinT-6D.github.io/.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_00510
institution	arXiv
publishDate	2024
record_format	arxiv
title	VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception
topic	Robotics
url	https://arxiv.org/abs/2501.00510

Similar Items