:: Library Catalog

Bibliographic Details
Main Authors:	Ispas, Alex-Răzvan; Deschamps-Berger, Théo; Devillers, Laurence
Format:	Preprint
Published:	2023
Subjects:	Computation and Language; Artificial Intelligence; I.2.7
Online Access:	https://arxiv.org/abs/2401.00536

Similar Items

Towards Lighter and Robust Evaluation for Retrieval Augmented Generation
by: Ispas, Alex-Razvan, et al.
Published: (2025)

University of Indonesia at SemEval-2025 Task 11: Evaluating State-of-the-Art Encoders for Multi-Label Emotion Detection
by: Hanif, Ikhlasul Akmal, et al.
Published: (2025)

Whether, Not Which: Mechanistic Interpretability Reveals Dissociable Affect Reception and Emotion Categorization in LLMs
by: Keeman, Michael
Published: (2026)

Text-Based Approaches to Item Difficulty Modeling in Large-Scale Assessments: A Systematic Review
by: Peters, Sydney, et al.
Published: (2025)

PetKaz at SemEval-2024 Task 3: Advancing Emotion Classification with an LLM for Emotion-Cause Pair Extraction in Conversations
by: Kazakov, Roman, et al.
Published: (2024)

From Guessing to Asking: An Approach to Resolving the Persona Knowledge Gap in LLMs during Multi-Turn Conversations
by: Baskar, Sarvesh, et al.
Published: (2025)

A Graph-based Approach for Multi-Modal Question Answering from Flowcharts in Telecom Documents
by: Soman, Sumit, et al.
Published: (2025)

RomanLens: The Role Of Latent Romanization In Multilinguality In LLMs
by: Saji, Alan, et al.
Published: (2025)

EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models
by: Paech, Samuel J.
Published: (2023)

Enhancing Transformer RNNs with Multiple Temporal Perspectives
by: Dumitru, Razvan-Gabriel, et al.
Published: (2024)

Integrating Emotional and Linguistic Models for Ethical Compliance in Large Language Models
by: Chang, Edward Y.
Published: (2024)

Residual Drift Dominates Contradiction in Multi-Turn Constraint Reasoning
by: Kawada, Sebastien
Published: (2026)

Multi-Turn Interactions for Text-to-SQL with Large Language Models
by: Xiong, Guanming, et al.
Published: (2024)

Project Riley: Multimodal Multi-Agent LLM Collaboration with Emotional Reasoning and Voting
by: Ortigoso, Ana Rita, et al.
Published: (2025)

From Reading to Compressing: Exploring the Multi-document Reader for Prompt Compression
by: Choi, Eunseong, et al.
Published: (2024)

Dynamic Domain Information Modulation Algorithm for Multi-domain Sentiment Analysis
by: Yue, Chunyi, et al.
Published: (2025)

Persona Inconstancy in Multi-Agent LLM Collaboration: Conformity, Confabulation, and Impersonation
by: Baltaji, Razan, et al.
Published: (2024)

USTCCTSU at SemEval-2024 Task 1: Reducing Anisotropy for Cross-lingual Semantic Textual Relatedness Task
by: Li, Jianjian, et al.
Published: (2024)

Instructional Agents: Reducing Teaching Faculty Workload through Multi-Agent Instructional Design
by: Yao, Huaiyuan, et al.
Published: (2025)

Seemingly Plausible Distractors in Multi-Hop Reasoning: Are Large Language Models Attentive Readers?
by: Bhuiya, Neeladri, et al.
Published: (2024)

ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models
by: Dumitru, Razvan-Gabriel, et al.
Published: (2025)

CopySpec: Accelerating LLMs with Speculative Copy-and-Paste Without Compromising Quality
by: Dumitru, Razvan-Gabriel, et al.
Published: (2025)

Multi-trait User Simulation with Adaptive Decoding for Conversational Task Assistants
by: Ferreira, Rafael, et al.
Published: (2024)

Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language Models
by: Xiong, Guanming, et al.
Published: (2024)

HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent
by: Xu, Weijie, et al.
Published: (2024)

HyperPersona: A Multi-Level Hypergraph Framework for Text-Based Automatic Personality Prediction
by: Heydari, Sina, et al.
Published: (2026)

MedPI: Evaluating AI Systems in Medical Patient-facing Interactions
by: V., Diego Fajardo, et al.
Published: (2025)

Hallucination or Creativity: How to Evaluate AI-Generated Scientific Stories?
by: Argese, Alex, et al.
Published: (2026)

Alignment Backfire: Language-Dependent Reversal of Safety Interventions Across 16 Languages in LLM Multi-Agent Systems
by: Fukui, Hiroki
Published: (2026)

Benchmarking the Performance of Pre-trained LLMs across Urdu NLP Tasks
by: Tahir, Munief Hassan, et al.
Published: (2024)

Can Large Language Models Grasp Legal Theories? Enhance Legal Reasoning with Insights from Multi-Agent Collaboration
by: Yuan, Weikang, et al.
Published: (2024)

Red Teaming for Large Language Models At Scale: Tackling Hallucinations on Mathematics Tasks
by: Buszydlik, Aleksander, et al.
Published: (2023)

A Survey of Task-Oriented Knowledge Graph Reasoning: Status, Applications, and Prospects
by: Niu, Guanglin, et al.
Published: (2025)

Generative Active Testing: Efficient LLM Evaluation via Proxy Task Adaptation
by: Ramakrishnan, Aashish Anantha, et al.
Published: (2026)

Beyond Black-Box Labels: Interpretable Criteria for Diagnosing Subjective NLP Tasks
by: Rair, Nisrine, et al.
Published: (2026)

Predictive Simultaneous Interpretation: Harnessing Large Language Models for Democratizing Real-Time Multilingual Communication
by: Iida, Kurando, et al.
Published: (2024)

K-MetBench: A Multi-Dimensional Benchmark for Fine-Grained Evaluation of Expert Reasoning, Locality, and Multimodality in Meteorology
by: Kim, Soyeon, et al.
Published: (2026)

Yes-MT's Submission to the Low-Resource Indic Language Translation Shared Task in WMT 2024
by: Bhaskar, Yash, et al.
Published: (2025)

Swiss-Bench SBP-002: A Frontier Model Comparison on Swiss Legal and Regulatory Tasks
by: Uenal, Fatih
Published: (2026)

Evaluating the Efficacy of Hybrid Deep Learning Models in Distinguishing AI-Generated Text
by: Oketunji, Abiodun Finbarrs
Published: (2023)