| Main Authors: | Iyer, Laya; Somani, Pranav; Guo, Alice; Jurafsky, Dan; Shani, Chen |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.11791 |
Similar Items
Learning Concepts, Not Tokens: Self-Supervised Semantic Alignment for Language Models
by: Zhang, Christine, et al.
Published: (2026)
From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning
by: Shani, Chen, et al.
Published: (2025)
The Roots of Performance Disparity in Multilingual Language Models: Intrinsic Modeling Difficulty or Design Choices?
by: Shani, Chen, et al.
Published: (2026)
Rethinking Word Similarity: Semantic Similarity through Classification Confusion
by: Zhou, Kaitlyn, et al.
Published: (2025)
Cooking Up Creativity: Enhancing LLM Creativity through Structured Recombination
by: Mizrahi, Moran, et al.
Published: (2025)
ELEPHANT: Measuring and understanding social sycophancy in LLMs
by: Cheng, Myra, et al.
Published: (2025)
Categorize Early, Integrate Late: Divergent Processing Strategies in Automatic Speech Recognition
by: Roll, Nathan, et al.
Published: (2026)
HumT DumT: Measuring and controlling human-like language in LLMs
by: Cheng, Myra, et al.
Published: (2025)
Accommodation and Epistemic Vigilance: A Pragmatic Account of Why LLMs Fail to Challenge Harmful Beliefs
by: Cheng, Myra, et al.
Published: (2026)
Token-Level Uncertainty-Aware Objective for Language Model Post-Training
by: Liu, Tingkai, et al.
Published: (2025)
Trained on Tokens, Calibrated on Concepts: The Emergence of Semantic Calibration in LLMs
by: Nakkiran, Preetum, et al.
Published: (2025)
HEART: A Unified Benchmark for Assessing Humans and LLMs in Emotional Support Dialogue
by: Iyer, Laya, et al.
Published: (2026)
A layer-wise analysis of Mandarin and English suprasegmentals in SSL speech models
by: de la Fuente, Antón, et al.
Published: (2024)
Othering and low status framing of immigrant cuisines in US restaurant reviews and large language models
by: Luo, Yiwei, et al.
Published: (2023)
SumTablets: A Transliteration Dataset of Sumerian Tablets
by: Simmons, Cole, et al.
Published: (2026)
Humans overrely on overconfident language models, across languages
by: Rathi, Neil, et al.
Published: (2025)
Training LLMs Beyond Next Token Prediction -- Filling the Mutual Information Gap
by: Yang, Chun-Hao, et al.
Published: (2025)
False Friends Are Not Foes: Investigating Vocabulary Overlap in Multilingual Language Models
by: Kallini, Julie, et al.
Published: (2025)
CausalGym: Benchmarking causal interpretability methods on linguistic tasks
by: Arora, Aryaman, et al.
Published: (2024)
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis
by: Bianchi, Federico, et al.
Published: (2024)
AnthroScore: A Computational Linguistic Measure of Anthropomorphism
by: Cheng, Myra, et al.
Published: (2024)
Transcribe, Translate, or Transliterate: An Investigation of Intermediate Representations in Spoken Language Models
by: Ògúnrèmí, Tolúlopé, et al.
Published: (2025)
Data Checklist: On Unit-Testing Datasets with Usable Information
by: Zhang, Heidi C., et al.
Published: (2024)
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
by: Shao, Chenze, et al.
Published: (2024)
Dialect prejudice predicts AI decisions about people's character, employability, and criminality
by: Hofmann, Valentin, et al.
Published: (2024)
Distilling Token-Trained Models into Byte-Level Models
by: Bao, Zishuo, et al.
Published: (2026)
Beyond Token-Level Policy Gradients for Complex Reasoning with Large Language Models
by: Xu, Mufan, et al.
Published: (2026)
A Benchmark for Learning to Translate a New Language from One Grammar Book
by: Tanzer, Garrett, et al.
Published: (2023)
Verbalizing LLMs' assumptions to explain and control sycophancy
by: Cheng, Myra, et al.
Published: (2026)
Generation Space Size: Understanding and Calibrating Open-Endedness of LLM Generations
by: Yu, Sunny, et al.
Published: (2025)
Dynamic Cheatsheet: Test-Time Learning with Adaptive Memory
by: Suzgun, Mirac, et al.
Published: (2025)
Bayesian scaling laws for in-context learning
by: Arora, Aryaman, et al.
Published: (2024)
Speculating LLMs' Chinese Training Data Pollution from Their Tokens
by: Zhang, Qingjie, et al.
Published: (2025)
Learning the meanings of function words from grounded language using a visual question answering model
by: Portelance, Eva, et al.
Published: (2023)
Why 1 + 1 < 1 in Visual Token Pruning: Beyond Naive Integration via Multi-Objective Balanced Covering
by: Li, Yangfu, et al.
Published: (2025)
Measuring Mental Health Variables in Computational Research: Toward Validated, Dimensional, and Transdiagnostic Approaches
by: Shani, Chen, et al.
Published: (2025)
NLP Systems That Can't Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps
by: Gligoric, Kristina, et al.
Published: (2024)
Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
by: Zhao, Haiyan, et al.
Published: (2024)
Grounding Gaps in Language Model Generations
by: Shaikh, Omar, et al.
Published: (2023)
Sentence-Level or Token-Level? A Comprehensive Study on Knowledge Distillation
by: Wei, Jingxuan, et al.
Published: (2024)