Staff View: Can LLM Agents Identify Spoken Dialects like a Linguist? :: Library Catalog

Bibliographic Details
Main Authors:	Bystrich, Tobias; Hamm, Lukas; Hassan, Maria; Fischbach, Lea; Flek, Lucie; Karimi, Akbar
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2603.29541

_version_	1866915901489545216
author	Bystrich, Tobias; Hamm, Lukas; Hassan, Maria; Fischbach, Lea; Flek, Lucie; Karimi, Akbar
author_facet	Bystrich, Tobias; Hamm, Lukas; Hassan, Maria; Fischbach, Lea; Flek, Lucie; Karimi, Akbar
contents	Due to the scarcity of labeled dialectal speech, audio dialect classification is a challenging task for most languages, including Swiss German. In this work, we explore the ability of large language models (LLMs) as agents in understanding the dialects and whether they can show comparable performance to models such as HuBERT in dialect classification. In addition, we provide an LLM baseline and a human linguist one. Our approach uses phonetic transcriptions produced by ASR systems and combines them with linguistic resources such as dialect feature maps, vowel history, and rules. Our findings indicate that, when linguistic information is provided, the LLM predictions improve. The human baseline shows that automatically generated transcriptions can be beneficial for such classifications, but also presents opportunities for improvement.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_29541
institution	arXiv
publishDate	2026
record_format	arxiv
title	Can LLM Agents Identify Spoken Dialects like a Linguist?
topic	Computation and Language
url	https://arxiv.org/abs/2603.29541
