Bibliographic Details
Main Authors:	Nigam, Shubham Kumar; Sarkar, Suparnojit; Patel, Piyush
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence; Information Retrieval; Machine Learning
Online Access:	https://arxiv.org/abs/2605.13292

Table of Contents:

Most existing medical dialogue systems operate in a single-turn question-answering paradigm or rely on template-based datasets, limiting conversational realism and multilingual applicability. We introduce IndicMedDialog, a parallel multi-turn medical dialogue dataset spanning English and nine Indic languages: Assamese, Bengali, Gujarati, Hindi, Marathi, Punjabi, Tamil, Telugu, and Urdu. The dataset extends MDDial with LLM-generated synthetic consultations, translated using TranslateGemma, verified by native speakers, and refined through a script-aware post-processing pipeline to correct phonetic, lexical, and character-spacing errors. Building on this dataset, we fine-tune IndicMedLM via parameter-efficient adaptation of a quantized small language model, incorporating optional patient pre-context to personalise multi-turn symptom elicitation. We evaluate against zero-shot multilingual baselines, conduct systematic error analysis across ten languages, and validate clinical plausibility through medical expert evaluation.
