| Main Authors: | Yang, Cai, Dou, Yao, Heineman, David, Wu, Xiaofeng, Xu, Wei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2508.10421 |
Similar Items
Improving Minimum Bayes Risk Decoding with Multi-Prompt
by: Heineman, David, et al.
Published: (2024)
Evaluating Large Language Models on Urdu Idiom Translation
by: Khan, Muhammad Farmal, et al.
Published: (2025)
Gavel: Agent Meets Checklist for Evaluating LLMs on Long-Context Legal Summarization
by: Dou, Yao, et al.
Published: (2026)
Creative and Context-Aware Translation of East Asian Idioms with GPT-4
by: Tang, Kenan, et al.
Published: (2024)
It's Not a Walk in the Park! Challenges of Idiom Translation in Speech-to-text Systems
by: Zaitova, Iuliia, et al.
Published: (2025)
The Impact of Visual Information in Chinese Characters: Evaluating Large Models' Ability to Recognize and Utilize Radicals
by: Wu, Xiaofeng, et al.
Published: (2024)
Memorization or Reasoning? Exploring the Idiom Understanding of LLMs
by: Kim, Jisu, et al.
Published: (2025)
Readability-guided Idiom-aware Sentence Simplification (RISS) for Chinese
by: Zhang, Jingshen, et al.
Published: (2024)
Chengyu-Bench: Benchmarking Large Language Models for Chinese Idiom Understanding and Use
by: Fu, Yicheng, et al.
Published: (2025)
Large Language Models for Persian $ \leftrightarrow $ English Idiom Translation
by: Rezaeimanesh, Sara, et al.
Published: (2024)
Towards a Path Dependent Account of Category Fluency
by: Heineman, David, et al.
Published: (2024)
Idiom Understanding as a Tool to Measure the Dialect Gap
by: Beauchemin, David, et al.
Published: (2025)
Idiom Detection in Sorani Kurdish Texts
by: Omer, Skala Kamaran, et al.
Published: (2025)
A Rising Tide Lifts All Boats: MTQE Rewards for Idioms Improve General Translation Quality
by: Agarwal, Ishika, et al.
Published: (2026)
DualCoTs: Dual Chain-of-Thoughts Prompting for Sentiment Lexicon Expansion of Idioms
by: Niu, Fuqiang, et al.
Published: (2024)
Tabular Data Understanding with LLMs: A Survey of Recent Advances and Challenges
by: Wu, Xiaofeng, et al.
Published: (2025)
How Good Are LLMs for Literary Translation, Really? Literary Translation Evaluation with Humans and LLMs
by: Zhang, Ran, et al.
Published: (2024)
NLP Datasets for Idiom and Figurative Language Tasks
by: Matheny, Blake, et al.
Published: (2025)
A Survey of Idiom Datasets for Psycholinguistic and Computational Research
by: Flor, Michael, et al.
Published: (2025)
Killing Two Flies with One Stone: An Attempt to Break LLMs Using English->Icelandic Idioms and Proper Names
by: Ármannsson, Bjarki, et al.
Published: (2024)
DMDTEval: An Evaluation and Analysis of LLMs on Disambiguation in Multi-domain Translation
by: Man, Zhibo, et al.
Published: (2025)
Exploring Safety Alignment Evaluation of LLMs in Chinese Mental Health Dialogues via LLM-as-Judge
by: Cai, Yunna, et al.
Published: (2025)
ClinConsensus: A Physician-Calibrated Benchmark for Evaluating Clinical Rubric Coverage in Chinese Medical LLMs
by: Zheng, Xiang, et al.
Published: (2026)
Comparative Study of Multilingual Idioms and Similes in Large Language Models
by: Khoshtab, Paria, et al.
Published: (2024)
Can LLMs Act as Historians? Evaluating Historical Research Capabilities of LLMs via the Chinese Imperial Examination
by: Gao, Lirong, et al.
Published: (2026)
Visual Puns from Idioms: An Iterative LLM-T2IM-MLLM Framework
by: Xiao, Kelaiti, et al.
Published: (2025)
AC-EVAL: Evaluating Ancient Chinese Language Understanding in Large Language Models
by: Wei, Yuting, et al.
Published: (2024)
Benchmarking Machine Translation on Chinese Social Media Texts
by: Zhao, Kaiyan, et al.
Published: (2026)
Machine Translation Evaluation Benchmark for Wu Chinese: Workflow and Analysis
by: Yu, Hongjian, et al.
Published: (2024)
CTourLLM: Enhancing LLMs with Chinese Tourism Knowledge
by: Wei, Qikai, et al.
Published: (2024)
General2Specialized LLMs Translation for E-commerce
by: Chen, Kaidi, et al.
Published: (2024)
Large Language Models for Classical Chinese Poetry Translation: Benchmarking, Evaluating, and Improving
by: Chen, Andong, et al.
Published: (2024)
SimulatorArena: Are User Simulators Reliable Proxies for Multi-Turn Evaluation of AI Assistants?
by: Dou, Yao, et al.
Published: (2025)
Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation
by: Heineman, David, et al.
Published: (2025)
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations
by: Sun, Jiaxing, et al.
Published: (2024)
Benchmarking the Detection of LLMs-Generated Modern Chinese Poetry
by: Wang, Shanshan, et al.
Published: (2025)
When Words Don't Mean What They Say: Figurative Understanding in Bengali Idioms
by: Sakhawat, Adib, et al.
Published: (2026)
A Hard Nut to Crack: Idiom Detection with Conversational Large Language Models
by: Fornaciari, Francesca De Luca, et al.
Published: (2024)
Anatomy of an Idiom: Tracing Non-Compositionality in Language Models
by: Gomes, Andrew
Published: (2025)
Unveiling the Competitive Dynamics: A Comparative Evaluation of American and Chinese LLMs
by: Jiang, Zhenhui, et al.
Published: (2024)