| Main Authors: | Huber, Thomas, Niklaus, Christina |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.15027 |
Similar Items
Crossing Linguistic Horizons: Finetuning and Comprehensive Evaluation of Vietnamese Large Language Models
by: Truong, Sang T., et al.
Published: (2024)
Evaluating Uncertainty Quantification Methods in Argumentative Large Language Models
by: Zhou, Kevin, et al.
Published: (2025)
Towards LLM-based Autograding for Short Textual Answers
by: Schneider, Johannes, et al.
Published: (2023)
Agentic CLEAR: Automating Multi-Level Evaluation of LLM Agents
by: Yehudai, Asaf, et al.
Published: (2026)
Automatic Input Rewriting Improves Translation with Large Language Models
by: Ki, Dayeon, et al.
Published: (2025)
CLEAR: A Clinically-Grounded Tabular Framework for Radiology Report Evaluation
by: Jiang, Yuyang, et al.
Published: (2025)
Evaluating Austrian A-Level German Essays with Large Language Models for Automated Essay Scoring
by: Kubesch, Jonas, et al.
Published: (2026)
Linguistic and Argument Diversity in Synthetic Data for Function-Calling Agents
by: Greenstein, Dan, et al.
Published: (2026)
CLEAR: Can Language Models Really Understand Causal Graphs?
by: Chen, Sirui, et al.
Published: (2024)
Inductive Linguistic Reasoning with Large Language Models
by: Ramji, Raghav, et al.
Published: (2024)
A Comprehensive Evaluation of Quantization Strategies for Large Language Models
by: Jin, Renren, et al.
Published: (2024)
A Comprehensive Evaluation on Event Reasoning of Large Language Models
by: Tao, Zhengwei, et al.
Published: (2024)
Are Large Language Models the future crowd workers of Linguistics?
by: Ferrazzo, Iris
Published: (2025)
Evaluating and Mitigating Linguistic Discrimination in Large Language Models
by: Dong, Guoliang, et al.
Published: (2024)
AddrLLM: Address Rewriting via Large Language Model on Nationwide Logistics Data
by: Yang, Qinchen, et al.
Published: (2024)
Rewriting Conversational Utterances with Instructed Large Language Models
by: Galimzhanova, Elnara, et al.
Published: (2024)
Are Large Language Models Reliable Argument Quality Annotators?
by: Mirzakhmedova, Nailia, et al.
Published: (2024)
Investigating Large Language Models' Linguistic Abilities for Text Preprocessing
by: Braga, Marco, et al.
Published: (2025)
Detecting Linguistic Indicators for Stereotype Assessment with Large Language Models
by: Görge, Rebekka, et al.
Published: (2025)
A Comprehensive Evaluation of Large Language Models on Mental Illnesses in Arabic Context
by: Zahran, Noureldin, et al.
Published: (2025)
What is an "Abstract Reasoner"? Revisiting Experiments and Arguments about Large Language Models
by: Yun, Tian, et al.
Published: (2025)
WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models
by: Gupta, Prannaya, et al.
Published: (2024)
A Comprehensive Evaluation of Large Language Models on Aspect-Based Sentiment Analysis
by: Zhou, Changzhi, et al.
Published: (2024)
SQLBench: A Comprehensive Evaluation for Text-to-SQL Capabilities of Large Language Models
by: Zhang, Bin, et al.
Published: (2024)
Can One-sided Arguments Lead to Response Change in Large Language Models?
by: Cisneros-Velarde, Pedro
Published: (2026)
ArgLLM-App: An Interactive System for Argumentative Reasoning with Large Language Models
by: Dejl, Adam, et al.
Published: (2026)
Signature vs. Substance: Evaluating the Balance of Adversarial Resistance and Linguistic Quality in Watermarking Large Language Models
by: Guo, William, et al.
Published: (2025)
CardiffNLP at CLEARS-2025: Prompting Large Language Models for Plain Language and Easy-to-Read Text Rewriting
by: Ayesh, Mutaz, et al.
Published: (2025)
Linguistic Blind Spots of Large Language Models
by: Cheng, Jiali, et al.
Published: (2025)
OphthBench: A Comprehensive Benchmark for Evaluating Large Language Models in Chinese Ophthalmology
by: Zhou, Chengfeng, et al.
Published: (2025)
Dynamic Knowledge Integration for Evidence-Driven Counter-Argument Generation with Large Language Models
by: Yeginbergen, Anar, et al.
Published: (2025)
Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond
by: Xu, Fangzhi, et al.
Published: (2023)
A Comprehensive Evaluation of Large Language Models on Benchmark Biomedical Text Processing Tasks
by: Jahan, Israt, et al.
Published: (2023)
TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models
by: Chu, Zheng, et al.
Published: (2023)
ViLLM-Eval: A Comprehensive Evaluation Suite for Vietnamese Large Language Models
by: Nguyen, Trong-Hieu, et al.
Published: (2024)
TCMBench: A Comprehensive Benchmark for Evaluating Large Language Models in Traditional Chinese Medicine
by: Yue, Wenjing, et al.
Published: (2024)
Supportiveness-based Knowledge Rewriting for Retrieval-augmented Language Modeling
by: Qiao, Zile, et al.
Published: (2024)
Predicting Compact Phrasal Rewrites with Large Language Models for ASR Post Editing
by: Zhang, Hao, et al.
Published: (2025)
Anonymity at Risk? Assessing Re-Identification Capabilities of Large Language Models
by: Nyffenegger, Alex, et al.
Published: (2023)
Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation
by: Gain, Baban, et al.
Published: (2025)