Bibliographic Details
Main Authors: Li, Yiheng, Fan, Cunxin, Ge, Chongjian, Zhao, Zhihao, Li, Chenran, Xu, Chenfeng, Yao, Huaxiu, Tomizuka, Masayoshi, Zhou, Bolei, Tang, Chen, Ding, Mingyu, Zhan, Wei
Format: Preprint
Published: 2024
Subjects: Robotics
Online Access: https://arxiv.org/abs/2407.04281
_version_ 1866913856278757376
author Li, Yiheng
Fan, Cunxin
Ge, Chongjian
Zhao, Zhihao
Li, Chenran
Xu, Chenfeng
Yao, Huaxiu
Tomizuka, Masayoshi
Zhou, Bolei
Tang, Chen
Ding, Mingyu
Zhan, Wei
contents Language models show unprecedented abilities in analyzing driving scenarios, owing to the vast knowledge accumulated from text-based pre-training. Naturally, they should particularly excel in analyzing rule-based interactions, such as those triggered by traffic laws, which are well documented in text. However, such interaction analysis remains underexplored due to the lack of dedicated language datasets that address it. Therefore, we propose Waymo Open Motion Dataset-Reasoning (WOMD-Reasoning), a comprehensive large-scale Q&A dataset built on WOMD that focuses on describing and reasoning about traffic rule-induced interactions in driving scenarios. WOMD-Reasoning is also by far the largest multi-modal Q&A dataset, with 3 million Q&As on real-world driving scenarios, covering a wide range of driving topics from map descriptions and motion status descriptions to narratives and analyses of agents' interactions, behaviors, and intentions. To showcase the applications of WOMD-Reasoning, we design Motion-LLaVA, a motion-language model fine-tuned on WOMD-Reasoning. Quantitative and qualitative evaluations are performed on the WOMD-Reasoning dataset as well as on the outputs of Motion-LLaVA, supporting the data quality and wide applicability of WOMD-Reasoning in tasks such as interaction prediction and traffic rule-compliant planning. The dataset and its vision-modality extension are available at https://waymo.com/open/download/. The code and prompts used to build it are available at https://github.com/yhli123/WOMD-Reasoning.
format Preprint
id arxiv_https___arxiv_org_abs_2407_04281
institution arXiv
publishDate 2024
record_format arxiv
title WOMD-Reasoning: A Large-Scale Dataset for Interaction Reasoning in Driving
topic Robotics
url https://arxiv.org/abs/2407.04281