| Main Authors: | de la Fuente, Antón, Jurafsky, Dan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2408.13678 |
Similar Items
UMA-Split: unimodal aggregation for both English and Mandarin non-autoregressive speech recognition
by: Fang, Ying, et al.
Published: (2025)
Tone recognition in low-resource languages of North-East India: peeling the layers of SSL-based speech models
by: Gogoi, Parismita, et al.
Published: (2025)
Humans overrely on overconfident language models, across languages
by: Rathi, Neil, et al.
Published: (2025)
Othering and low status framing of immigrant cuisines in US restaurant reviews and large language models
by: Luo, Yiwei, et al.
Published: (2023)
An efficient text augmentation approach for contextualized Mandarin speech recognition
by: Zheng, Naijun, et al.
Published: (2024)
Introducing MELI: the Mandarin-English Language Interview Corpus
by: Liu, Suyuan, et al.
Published: (2026)
Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens
by: San, Nay, et al.
Published: (2024)
Learning Concepts, Not Tokens: Self-Supervised Semantic Alignment for Language Models
by: Zhang, Christine, et al.
Published: (2026)
SumTablets: A Transliteration Dataset of Sumerian Tablets
by: Simmons, Cole, et al.
Published: (2026)
Punctuation Restoration for Singaporean Spoken Languages: English, Malay, and Mandarin
by: Rao, Abhinav, et al.
Published: (2022)
HumT DumT: Measuring and controlling human-like language in LLMs
by: Cheng, Myra, et al.
Published: (2025)
Accommodation and Epistemic Vigilance: A Pragmatic Account of Why LLMs Fail to Challenge Harmful Beliefs
by: Cheng, Myra, et al.
Published: (2026)
AnthroScore: A Computational Linguistic Measure of Anthropomorphism
by: Cheng, Myra, et al.
Published: (2024)
False Friends Are Not Foes: Investigating Vocabulary Overlap in Multilingual Language Models
by: Kallini, Julie, et al.
Published: (2025)
Automated evaluation of LLMs for effective machine translation of Mandarin Chinese to English
by: Zhang, Yue, et al.
Published: (2026)
Multilingual Stutter Event Detection for English, German, and Mandarin Speech
by: Haas, Felix, et al.
Published: (2026)
CausalGym: Benchmarking causal interpretability methods on linguistic tasks
by: Arora, Aryaman, et al.
Published: (2024)
Learning the meanings of function words from grounded language using a visual question answering model
by: Portelance, Eva, et al.
Published: (2023)
Data Checklist: On Unit-Testing Datasets with Usable Information
by: Zhang, Heidi C., et al.
Published: (2024)
Transcribe, Translate, or Transliterate: An Investigation of Intermediate Representations in Spoken Language Models
by: Ògúnrèmí, Tolúlopé, et al.
Published: (2025)
Advancing Speech Translation: A Corpus of Mandarin-English Conversational Telephone Speech
by: Wotherspoon, Shannon, et al.
Published: (2024)
Direct Preference Optimization for English-Mandarin Code-Switching Speech Recognition in Audio LLMs
by: Quang, Trung Nguyen, et al.
Published: (2026)
CS3-Bench: Evaluating and Enhancing Speech-to-Speech LLMs for Mandarin-English Code-Switching
by: Liu, Heyang, et al.
Published: (2025)
What can large language models do for sustainable food?
by: Thomas, Anna T., et al.
Published: (2025)
A Benchmark for Learning to Translate a New Language from One Grammar Book
by: Tanzer, Garrett, et al.
Published: (2023)
Dialect prejudice predicts AI decisions about people's character, employability, and criminality
by: Hofmann, Valentin, et al.
Published: (2024)
Beyond Tokens: Concept-Level Training Objectives for LLMs
by: Iyer, Laya, et al.
Published: (2026)
The Roots of Performance Disparity in Multilingual Language Models: Intrinsic Modeling Difficulty or Design Choices?
by: Shani, Chen, et al.
Published: (2026)
Form and meaning co-determine the realization of tone in Taiwan Mandarin spontaneous speech: the case of T2-T3 and T3-T3 tone sandhi
by: Lu, Yuxin, et al.
Published: (2024)
Rethinking Word Similarity: Semantic Similarity through Classification Confusion
by: Zhou, Kaitlyn, et al.
Published: (2025)
Generation Space Size: Understanding and Calibrating Open-Endedness of LLM Generations
by: Yu, Sunny, et al.
Published: (2025)
Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory
by: Suzgun, Mirac, et al.
Published: (2025)
Bayesian scaling laws for in-context learning
by: Arora, Aryaman, et al.
Published: (2024)
Mechanistic evaluation of Transformers and state space models
by: Arora, Aryaman, et al.
Published: (2025)
Word-specific tonal realizations in Mandarin
by: Chuang, Yu-Ying, et al.
Published: (2024)
Measuring Taiwanese Mandarin Language Understanding
by: Chen, Po-Heng, et al.
Published: (2024)
The mutual exclusivity bias of bilingual visually grounded speech models
by: Oneata, Dan, et al.
Published: (2025)
NLP Systems That Can't Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps
by: Gligoric, Kristina, et al.
Published: (2024)
Grounding Gaps in Language Model Generations
by: Shaikh, Omar, et al.
Published: (2023)
Translating speech with just images
by: Oneata, Dan, et al.
Published: (2024)