:: Library Catalog

Bibliographic Details
Main Authors:	Saba, Syeda Jannatus; Skiena, Steven
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2509.19611

Similar Items

Gatsby Without the 'E': Crafting Lipograms with LLMs
by: Balasubramanian, Rohan, et al.
Published: (2025)

Reducing Tokenization Premiums for Low-Resource Languages
by: Churchill, Geoffrey, et al.
Published: (2026)

Hierarchies over Vector Space: Orienting Word and Graph Embeddings
by: Guo, Xingzhi, et al.
Published: (2022)

The Shape of Word Embeddings: Quantifying Non-Isometry With Topological Data Analysis
by: Draganov, Ondřej, et al.
Published: (2024)

Word Definitions from Large Language Models
by: Pham, Bach, et al.
Published: (2023)

The Telephone Game: Evaluating Semantic Drift in Unified Models
by: Mollah, Sabbir, et al.
Published: (2025)

Telephone Surveys Meet Conversational AI: Evaluating a LLM-Based Telephone Survey System at Scale
by: Lang, Max M., et al.
Published: (2025)

Advancing Speech Translation: A Corpus of Mandarin-English Conversational Telephone Speech
by: Wotherspoon, Shannon, et al.
Published: (2024)

Role-Playing Evaluation for Large Language Models
by: Boudouri, Yassine El, et al.
Published: (2025)

Trans-Zero: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data
by: Zou, Wei, et al.
Published: (2025)

Reinterpreting 'the Company a Word Keeps': Towards Explainable and Ontologically Grounded Language Models
by: Saba, Walid S.
Published: (2024)

Evaluating Large Language Models on Urdu Idiom Translation
by: Khan, Muhammad Farmal, et al.
Published: (2025)

Saying the Unsaid: Revealing the Hidden Language of Multimodal Systems Through Telephone Games
by: Zhao, Juntu, et al.
Published: (2025)

RPGBENCH: Evaluating Large Language Models as Role-Playing Game Engines
by: Yu, Pengfei, et al.
Published: (2025)

PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation
by: Gusev, Ilya
Published: (2024)

Playing repeated games with Large Language Models
by: Akata, Elif, et al.
Published: (2023)

TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models
by: Ahn, Jaewoo, et al.
Published: (2024)

What do Large Language Models Need for Machine Translation Evaluation?
by: Qian, Shenbin, et al.
Published: (2024)

Towards Large Language Model driven Reference-less Translation Evaluation for English and Indian Languages
by: Mujadia, Vandan, et al.
Published: (2024)

Capturing Human Cognitive Styles with Language: Towards an Experimental Evaluation Paradigm
by: Varadarajan, Vasudha, et al.
Published: (2025)

Proverbs Run in Pairs: Evaluating Proverb Translation Capability of Large Language Model
by: Wang, Minghan, et al.
Published: (2025)

Ethical Considerations of Large Language Models in Game Playing
by: Zhang, Qingquan, et al.
Published: (2025)

Beyond Mode-Seeking RL: Trajectory-Balance Post-Training for Diffusion Language Models
by: Ahmadi, Saba, et al.
Published: (2026)

LLM as a Broken Telephone: Iterative Generation Distorts Information
by: Mohamed, Amr, et al.
Published: (2025)

Evaluation of Pose Estimation Systems for Sign Language Translation
by: O'Brien, Catherine, et al.
Published: (2026)

X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
by: Xu, Haoran, et al.
Published: (2024)

Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models
by: Lu, Qingyu, et al.
Published: (2023)

The 2020s Political Economy of Machine Translation
by: Weber, Steven
Published: (2020)

Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay
by: de Carvalho, Gonçalo Hora, et al.
Published: (2024)

Evaluating the Translation Performance of Large Language Models Based on Euas-20
by: Huang, Yan, et al.
Published: (2024)

Lost in the Source Language: How Large Language Models Evaluate the Quality of Machine Translation
by: Huang, Xu, et al.
Published: (2024)

SMS Spam Detection and Classification to Combat Abuse in Telephone Networks Using Natural Language Processing
by: Oyeyemi, Dare Azeez, et al.
Published: (2024)

Input Matters: Evaluating Input Structure's Impact on LLM Summaries of Sports Play-by-Play
by: Sundararajan, Barkavi, et al.
Published: (2025)

Evaluating Temporal Consistency in Multi-Turn Language Models
by: Atri, Yash Kumar, et al.
Published: (2026)

A Critical Study of Automatic Evaluation in Sign Language Translation
by: Yazdani, Shakib, et al.
Published: (2025)

AI Telephone Surveying: Automating Quantitative Data Collection with an AI Interviewer
by: Leybzon, Danny D., et al.
Published: (2025)

An Examination of the Robustness of Reference-Free Image Captioning Evaluation Metrics
by: Ahmadi, Saba, et al.
Published: (2023)

Unplug and Play Language Models: Decomposing Experts in Language Models at Inference Time
by: Yang, Nakyeong, et al.
Published: (2024)

Large Language Models for Classical Chinese Poetry Translation: Benchmarking, Evaluating, and Improving
by: Chen, Andong, et al.
Published: (2024)

Playing with Words, Improving with Rewards: Training Language Models for Creative Association
by: Deshpande, Vijeta, et al.
Published: (2026)