| Main Authors: | Michaelov, James A., Arnett, Catherine, Bergen, Benjamin K. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.19178 |
Similar Items
On the Acquisition of Shared Grammatical Representations in Bilingual Language Models
by: Arnett, Catherine, et al.
Published: (2025)
Disaggregation Reveals Hidden Training Dynamics: The Case of Agreement Attraction
by: Michaelov, James A., et al.
Published: (2025)
Emergent inabilities? Inverse scaling over the course of pretraining
by: Michaelov, James A., et al.
Published: (2023)
Language Model Behavioral Phases are Consistent Across Architecture, Training Data, and Scale
by: Michaelov, James A., et al.
Published: (2025)
Why do language models perform worse for morphologically complex languages?
by: Arnett, Catherine, et al.
Published: (2024)
Goldfish: Monolingual Language Models for 350 Languages
by: Chang, Tyler A., et al.
Published: (2024)
A Bit of a Problem: Measurement Disparities in Dataset Sizes Across Languages
by: Arnett, Catherine, et al.
Published: (2024)
Not quite Sherlock Holmes: Language model predictions do not reliably differentiate impossible from improbable events
by: Michaelov, James A., et al.
Published: (2025)
N-gram-like Language Models Predict Reading Time Best
by: Michaelov, James A., et al.
Published: (2026)
How Open Must Language Models be to Enable Reliable Scientific Inference?
by: Michaelov, James A., et al.
Published: (2026)
Explaining and Mitigating Crosslingual Tokenizer Inequities
by: Arnett, Catherine, et al.
Published: (2025)
Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
by: Chang, Tyler A., et al.
Published: (2025)
Large Language Models Pass the Turing Test
by: Jones, Cameron R., et al.
Published: (2025)
Do Large Language Models Exhibit Spontaneous Rational Deception?
by: Taylor, Samuel M., et al.
Published: (2025)
Characterizing Learning Curves During Language Model Pre-Training: Learning, Forgetting, and Stability
by: Chang, Tyler A., et al.
Published: (2023)
BPE Stays on SCRIPT: Structured Encoding for Robust Multilingual Pretokenization
by: Land, Sander, et al.
Published: (2025)
Evaluating Morphological Alignment of Tokenizers in 70 Languages
by: Arnett, Catherine, et al.
Published: (2025)
Language Statistics and False Belief Reasoning: Evidence from 41 Open-Weight LMs
by: Trott, Sean, et al.
Published: (2026)
Lies, Damned Lies, and Distributional Language Statistics: Persuasion and Deception with Large Language Models
by: Jones, Cameron R., et al.
Published: (2024)
Computational Sentence-level Metrics Predicting Human Sentence Comprehension
by: Sun, Kun, et al.
Published: (2024)
EVOKE: Emotion Vocabulary Of Korean and English
by: Jung, Yoonwon, et al.
Published: (2026)
Does GPT-4 pass the Turing test?
by: Jones, Cameron R., et al.
Published: (2023)
Weight Tying Biases Token Embeddings Towards the Output Space
by: Lopardo, Antonio, et al.
Published: (2026)
Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement
by: Arnett, Catherine, et al.
Published: (2024)
Intra-Layer Recurrence in Transformers for Language Modeling
by: Nguyen, Anthony, et al.
Published: (2025)
BPE Gets Picky: Efficient Vocabulary Refinement During Tokenizer Training
by: Chizhov, Pavel, et al.
Published: (2024)
Autoregressive + Chain of Thought = Recurrent: Recurrence's Role in Language Models' Computability and a Revisit of Recurrent Transformer
by: Zhang, Xiang, et al.
Published: (2024)
Toxicity of the Commons: Curating Open-Source Pre-Training Data
by: Arnett, Catherine, et al.
Published: (2024)
Dissecting the Ullman Variations with a SCALPEL: Why do LLMs fail at Trivial Alterations to the False Belief Task?
by: Pi, Zhiqiang, et al.
Published: (2024)
Will Large Language Models Transform Clinical Prediction?
by: Yildiz, Yusuf, et al.
Published: (2025)
GPT-4 is judged more human than humans in displaced and inverted Turing tests
by: Rathi, Ishika, et al.
Published: (2024)
Diverging Transformer Predictions for Human Sentence Processing: A Comprehensive Analysis of Agreement Attraction Effects
by: von der Malsburg, Titus, et al.
Published: (2026)
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
by: Botev, Aleksandar, et al.
Published: (2024)
Re-defining Humor Data Objects for AI Humor Research
by: Arnett, Anna, et al.
Published: (2026)
A Comprehensive Evaluation of Semantic Relation Knowledge of Pretrained Language Models and Humans
by: Cao, Zhihan, et al.
Published: (2024)
Comprehensiveness Metrics for Automatic Evaluation of Factual Recall in Text Generation
by: Dejl, Adam, et al.
Published: (2025)
An Algebraic View of the Expressivity of Recurrent Language Models
by: Nowak, Franz, et al.
Published: (2026)
Can Large Language Models Match the Conclusions of Systematic Reviews?
by: Polzak, Christopher, et al.
Published: (2025)
Unraveling the Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study
by: Faria, Fatema Tuj Johora, et al.
Published: (2024)
LLMs and people both learn to form conventions -- just not with each other
by: Jones, Cameron R., et al.
Published: (2026)