Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Chavda, Anshul, Jagadeesh, M, Kullayappa, Chintalapalli Raja, Jayaprakash, B, Sruthi, Medchalimi, Bhattacharyya, Pushpak
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2507.19867
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866916865033371648
author	Chavda, Anshul Jagadeesh, M Kullayappa, Chintalapalli Raja Jayaprakash, B Sruthi, Medchalimi Bhattacharyya, Pushpak
author_facet	Chavda, Anshul Jagadeesh, M Kullayappa, Chintalapalli Raja Jayaprakash, B Sruthi, Medchalimi Bhattacharyya, Pushpak
contents	In-car conversational AI is becoming increasingly critical as autonomous vehicles and smart assistants gain widespread adoption. Yet, existing datasets fail to capture the spontaneous disfluencies such as hesitations, false starts, repetitions, and self-corrections that characterize real driver-AI dialogs. To address this, we introduce DiscoDrive, a synthetic corpus of 3500 multi-turn dialogs across seven automotive domains, generated using a two-stage, prompt-driven pipeline that dynamically integrates disfluencies during synthesis. We show that DiscoDrive is effective both as a training resource, enabling DialoGPT-Medium and T5-Base to match or exceed KVRET-trained models on the MultiWOZ 2.2 and Schema-Guided Dialogue (SGD) relevant test sets (BLEU-4 improvements of 0.26 to 0.61; METEOR +2.10; ROUGE-L +3.48; BERTScore F1 improvements of 1.35 to 3.48), and as a data augmentation resource in low-resource scenarios, delivering additional gains of up to BLEU-4 +0.38, METEOR +1.95, ROUGE-L +2.87, and BERTScore F1 +4.00 when combined with 10 percent of KVRET. Human evaluations further confirm that dialogs sampled from DiscoDrive are rated higher than KVRET's human-collected dialogs in naturalness (3.8 vs 3.6) and coherence (4.1 vs 4.0), and are perceived as more context-appropriate than leading post-hoc methods (such as LARD), without compromising clarity. DiscoDrive fills a critical gap in existing resources and serves as a versatile corpus for both training and augmenting conversational AI, enabling robust handling of real-world, disfluent in-car interactions.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_19867
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	DRIVE: Disfluency-Rich Synthetic Dialog Data Generation Framework for Intelligent Vehicle Environments Chavda, Anshul Jagadeesh, M Kullayappa, Chintalapalli Raja Jayaprakash, B Sruthi, Medchalimi Bhattacharyya, Pushpak Computation and Language In-car conversational AI is becoming increasingly critical as autonomous vehicles and smart assistants gain widespread adoption. Yet, existing datasets fail to capture the spontaneous disfluencies such as hesitations, false starts, repetitions, and self-corrections that characterize real driver-AI dialogs. To address this, we introduce DiscoDrive, a synthetic corpus of 3500 multi-turn dialogs across seven automotive domains, generated using a two-stage, prompt-driven pipeline that dynamically integrates disfluencies during synthesis. We show that DiscoDrive is effective both as a training resource, enabling DialoGPT-Medium and T5-Base to match or exceed KVRET-trained models on the MultiWOZ 2.2 and Schema-Guided Dialogue (SGD) relevant test sets (BLEU-4 improvements of 0.26 to 0.61; METEOR +2.10; ROUGE-L +3.48; BERTScore F1 improvements of 1.35 to 3.48), and as a data augmentation resource in low-resource scenarios, delivering additional gains of up to BLEU-4 +0.38, METEOR +1.95, ROUGE-L +2.87, and BERTScore F1 +4.00 when combined with 10 percent of KVRET. Human evaluations further confirm that dialogs sampled from DiscoDrive are rated higher than KVRET's human-collected dialogs in naturalness (3.8 vs 3.6) and coherence (4.1 vs 4.0), and are perceived as more context-appropriate than leading post-hoc methods (such as LARD), without compromising clarity. DiscoDrive fills a critical gap in existing resources and serves as a versatile corpus for both training and augmenting conversational AI, enabling robust handling of real-world, disfluent in-car interactions.
title	DRIVE: Disfluency-Rich Synthetic Dialog Data Generation Framework for Intelligent Vehicle Environments
topic	Computation and Language
url	https://arxiv.org/abs/2507.19867

Similar Items