Staff View: Can LLMs Understand the Implication of Emphasized Sentences in Dialogue? :: Library Catalog

Bibliographic Details
Main Authors:	Lin, Guan-Ting; Lee, Hung-yi
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2406.11065

_version_	1866909328247619584
author	Lin, Guan-Ting; Lee, Hung-yi
author_facet	Lin, Guan-Ting; Lee, Hung-yi
contents	Emphasis is a crucial component in human communication, which indicates the speaker's intention and implication beyond pure text in dialogue. While Large Language Models (LLMs) have revolutionized natural language processing, their ability to understand emphasis in dialogue remains unclear. This paper introduces Emphasized-Talk, a benchmark with emphasis-annotated dialogue samples capturing the implications of emphasis. We evaluate various LLMs, both open-source and commercial, to measure their performance in understanding emphasis. Additionally, we propose an automatic evaluation pipeline using GPT-4, which achieves a high correlation with human rating. Our findings reveal that although commercial LLMs generally perform better, there is still significant room for improvement in comprehending emphasized sentences.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_11065
institution	arXiv
publishDate	2024
record_format	arxiv
title	Can LLMs Understand the Implication of Emphasized Sentences in Dialogue?
topic	Computation and Language
url	https://arxiv.org/abs/2406.11065
