Staff View: AEQ-Bench: Measuring Empathy of Omni-Modal Large Models

Bibliographic Details
Main Authors:	Luo, Xuan; Yao, Lewei; Zhao, Libo; Hong, Lanqing; Chen, Kai; Tao, Dehua; Tan, Daxin; Xu, Ruifeng; Li, Jing
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Human-Computer Interaction
Online Access:	https://arxiv.org/abs/2601.10513

_version_	1866917205653848064
author	Luo, Xuan; Yao, Lewei; Zhao, Libo; Hong, Lanqing; Chen, Kai; Tao, Dehua; Tan, Daxin; Xu, Ruifeng; Li, Jing
author_facet	Luo, Xuan; Yao, Lewei; Zhao, Libo; Hong, Lanqing; Chen, Kai; Tao, Dehua; Tan, Daxin; Xu, Ruifeng; Li, Jing
contents	While the automatic evaluation of omni-modal large models (OLMs) is essential, assessing empathy remains a significant challenge due to its inherent affectivity. To investigate this challenge, we introduce AEQ-Bench (Audio Empathy Quotient Benchmark), a novel benchmark to systematically assess two core empathetic capabilities of OLMs: (i) generating empathetic responses by comprehending affective cues from multi-modal inputs (audio + text), and (ii) judging the empathy of audio responses without relying on text transcription. Compared to existing benchmarks, AEQ-Bench incorporates two novel settings that vary in context specificity and speech tone. Comprehensive assessment across linguistic and paralinguistic metrics reveals that (1) OLMs trained with audio output capabilities generally outperform models with text-only outputs, and (2) while OLMs align with human judgments for coarse-grained quality assessment, they remain unreliable for evaluating fine-grained paralinguistic expressiveness.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_10513
institution	arXiv
publishDate	2026
record_format	arxiv
title	AEQ-Bench: Measuring Empathy of Omni-Modal Large Models
topic	Computation and Language; Human-Computer Interaction
url	https://arxiv.org/abs/2601.10513
