| Main Authors: | Simmons, Cole; Martinez, Richard Diehl; Jurafsky, Dan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.22200 |
Similar Items
Transcribe, Translate, or Transliterate: An Investigation of Intermediate Representations in Spoken Language Models
by: Ògúnrèmí, Tolúlopé, et al.
Published: (2025)
Data Checklist: On Unit-Testing Datasets with Usable Information
by: Zhang, Heidi C., et al.
Published: (2024)
A layer-wise analysis of Mandarin and English suprasegmentals in SSL speech models
by: de la Fuente, Antón, et al.
Published: (2024)
Learning Concepts, Not Tokens: Self-Supervised Semantic Alignment for Language Models
by: Zhang, Christine, et al.
Published: (2026)
HumT DumT: Measuring and controlling human-like language in LLMs
by: Cheng, Myra, et al.
Published: (2025)
BanTH: A Multi-label Hate Speech Detection Dataset for Transliterated Bangla
by: Haider, Fabiha, et al.
Published: (2024)
Othering and low status framing of immigrant cuisines in US restaurant reviews and large language models
by: Luo, Yiwei, et al.
Published: (2023)
Accommodation and Epistemic Vigilance: A Pragmatic Account of Why LLMs Fail to Challenge Harmful Beliefs
by: Cheng, Myra, et al.
Published: (2026)
Humans overrely on overconfident language models, across languages
by: Rathi, Neil, et al.
Published: (2025)
Do "New Snow Tablets" Contain Snow? Large Language Models Over-Rely on Names to Identify Ingredients of Chinese Drugs
by: Li, Sifan, et al.
Published: (2025)
AnthroScore: A Computational Linguistic Measure of Anthropomorphism
by: Cheng, Myra, et al.
Published: (2024)
How Transliterations Improve Crosslingual Alignment
by: Liu, Yihong, et al.
Published: (2024)
False Friends Are Not Foes: Investigating Vocabulary Overlap in Multilingual Language Models
by: Kallini, Julie, et al.
Published: (2025)
CausalGym: Benchmarking causal interpretability methods on linguistic tasks
by: Arora, Aryaman, et al.
Published: (2024)
CES 2011: Tablet Crazy
by: Rapp, David
Published: (2011)
Connecting the Persian-speaking World through Transliteration
by: Merchant, Rayyan, et al.
Published: (2025)
Jailbreaking LLMs with Arabic Transliteration and Arabizi
by: Ghanim, Mansour Al, et al.
Published: (2024)
Language Detection for Transliterated Content
by: S, Selva Kumar, et al.
Published: (2024)
ParsTranslit: Truly Versatile Tajik-Farsi Transliteration
by: Merchant, Rayyan, et al.
Published: (2025)
Swa Bhasha: Message-Based Singlish to Sinhala Transliteration
by: Athukorala, Maneesha U., et al.
Published: (2024)
A Benchmark for Learning to Translate a New Language from One Grammar Book
by: Tanzer, Garrett, et al.
Published: (2023)
Dialect prejudice predicts AI decisions about people's character, employability, and criminality
by: Hofmann, Valentin, et al.
Published: (2024)
Beyond Tokens: Concept-Level Training Objectives for LLMs
by: Iyer, Laya, et al.
Published: (2026)
The Roots of Performance Disparity in Multilingual Language Models: Intrinsic Modeling Difficulty or Design Choices?
by: Shani, Chen, et al.
Published: (2026)
Scripts Through Time: A Survey of the Evolving Role of Transliteration in NLP
by: Jayakumar, Thanmay, et al.
Published: (2026)
A Tale of Two Scripts: Transliteration and Post-Correction for Judeo-Arabic
by: Gonzalez, Juan Moreno, et al.
Published: (2025)
Tending Towards Stability: Convergence Challenges in Small Language Models
by: Martinez, Richard Diehl, et al.
Published: (2024)
AyutthayaAlpha: A Thai-Latin Script Transliteration Transformer
by: Lauc, Davor, et al.
Published: (2024)
Happiness is Sharing a Vocabulary: A Study of Transliteration Methods
by: Jung, Haeji, et al.
Published: (2025)
Rethinking Word Similarity: Semantic Similarity through Classification Confusion
by: Zhou, Kaitlyn, et al.
Published: (2025)
Beyond Specialization: Benchmarking LLMs for Transliteration of Indian Languages
by: Azam, Gulfarogh, et al.
Published: (2025)
Generation Space Size: Understanding and Calibrating Open-Endedness of LLM Generations
by: Yu, Sunny, et al.
Published: (2025)
Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory
by: Suzgun, Mirac, et al.
Published: (2025)
Bayesian scaling laws for in-context learning
by: Arora, Aryaman, et al.
Published: (2024)
Romanized to Native Malayalam Script Transliteration Using an Encoder-Decoder Framework
by: Baiju, Bajiyo, et al.
Published: (2024)
Learning the meanings of function words from grounded language using a visual question answering model
by: Portelance, Eva, et al.
Published: (2023)
Fractured Tablets
by: Balberg, Mira
Published: (2023)
NLP Systems That Can't Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps
by: Gligoric, Kristina, et al.
Published: (2024)
Grounding Gaps in Language Model Generations
by: Shaikh, Omar, et al.
Published: (2023)
Linear Script Representations in Speech Foundation Models Enable Zero-Shot Transliteration
by: Shim, Ryan Soh-Eun, et al.
Published: (2026)