Bibliographic Details
Main Authors: Bystrich, Tobias, Hamm, Lukas, Hassan, Maria, Fischbach, Lea, Flek, Lucie, Karimi, Akbar
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2603.29541
author Bystrich, Tobias
Hamm, Lukas
Hassan, Maria
Fischbach, Lea
Flek, Lucie
Karimi, Akbar
contents Because labeled dialectal speech is scarce, audio dialect classification is a challenging task for most languages, including Swiss German. In this work, we explore how well large language models (LLMs), acting as agents, understand dialects and whether they can achieve performance comparable to models such as HuBERT in dialect classification. In addition, we provide an LLM baseline and a human linguist baseline. Our approach combines phonetic transcriptions produced by ASR systems with linguistic resources such as dialect feature maps, vowel history, and rules. Our findings indicate that LLM predictions improve when linguistic information is provided. The human baseline shows that automatically generated transcriptions can be beneficial for such classification, but also that there is room for improvement.
format Preprint
id arxiv_https___arxiv_org_abs_2603_29541
institution arXiv
publishDate 2026
record_format arxiv
title Can LLM Agents Identify Spoken Dialects like a Linguist?
topic Computation and Language
url https://arxiv.org/abs/2603.29541