| Main Authors: | Ku, Alexander Y., Griffiths, Thomas L., Chan, Stephanie C. Y. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.09855 |
Similar Items
On the generalization of language models from in-context learning and finetuning: a controlled study
by: Lampinen, Andrew K., et al.
Published: (2025)
Uncovering Competency Gaps in Large Language Models and Their Benchmarks
by: Bohacek, Maty, et al.
Published: (2025)
Are Large Language Models Reliable AI Scientists? Assessing Reverse-Engineering of Black-Box Systems
by: Geng, Jiayi, et al.
Published: (2025)
Language models show human-like content effects on reasoning tasks
by: Dasgupta, Ishita, et al.
Published: (2022)
How do Large Language Models Navigate Conflicts between Honesty and Helpfulness?
by: Liu, Ryan, et al.
Published: (2024)
Large Language Models Assume People are More Rational than We Really are
by: Liu, Ryan, et al.
Published: (2024)
Rational Metareasoning for Large Language Models
by: De Sabbata, C. Nicolò, et al.
Published: (2024)
RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation
by: Liang, Kaiqu, et al.
Published: (2025)
Emergent Semantic Role Understanding in Language Models
by: Griffiths, Carla, et al.
Published: (2026)
Cognitive Architectures for Language Agents
by: Sumers, Theodore R., et al.
Published: (2023)
Are Large Language Models Sensitive to the Motives Behind Communication?
by: Wu, Addison J., et al.
Published: (2025)
A density estimation perspective on learning from pairwise human preferences
by: Dumoulin, Vincent, et al.
Published: (2023)
Machine Bullshit: Characterizing the Emergent Disregard for Truth in Large Language Models
by: Liang, Kaiqu, et al.
Published: (2025)
Diagnosing Transformers: Illuminating Feature Spaces for Clinical Decision-Making
by: Hsu, Aliyah R., et al.
Published: (2023)
Hallucination Detection in LLMs: Fast and Memory-Efficient Fine-Tuned Models
by: Arteaga, Gabriel Y., et al.
Published: (2024)
What Should Embeddings Embed? Autoregressive Models Represent Latent Generating Distributions
by: Zhang, Liyi, et al.
Published: (2024)
Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition
by: Hsu, Aliyah R., et al.
Published: (2024)
Analyzing the Roles of Language and Vision in Learning from Limited Data
by: Chen, Allison, et al.
Published: (2024)
Mind Your Step (by Step): Chain-of-Thought can Reduce Performance on Tasks where Thinking Makes Humans Worse
by: Liu, Ryan, et al.
Published: (2024)
Bringing Up a Bilingual BabyLM: Investigating Multilingual Language Acquisition Using Small-Scale Models
by: Zeng, Linda, et al.
Published: (2026)
Identifying and Mitigating the Influence of the Prior Distribution in Large Language Models
by: Zhang, Liyi, et al.
Published: (2025)
Transforming Agency. On the mode of existence of Large Language Models
by: Barandiaran, Xabier E., et al.
Published: (2024)
Is Child-Directed Speech Effective Training Data for Language Models?
by: Feng, Steven Y., et al.
Published: (2024)
Baby Scale: Investigating Models Trained on Individual Children's Language Input
by: Feng, Steven Y., et al.
Published: (2026)
On the Ability of Transformers to Verify Plans
by: Sarrof, Yash, et al.
Published: (2026)
Does Transformer Interpretability Transfer to RNNs?
by: Paulo, Gonçalo, et al.
Published: (2024)
STAT: Shrinking Transformers After Training
by: Flynn, Megan, et al.
Published: (2024)
Intelligent Learning Rate Distribution to reduce Catastrophic Forgetting in Transformers
by: Kenneweg, Philip, et al.
Published: (2024)
The Point of View of a Sentiment: Towards Clinician Bias Detection in Psychiatric Notes
by: Valentine, Alissa A., et al.
Published: (2024)
The Condensate Theorem: Transformers are O(n), Not $O(n^2)$
by: Williams, Jorge L. Ruiz
Published: (2026)
Low-rank finetuning for LLMs: A fairness perspective
by: Das, Saswat, et al.
Published: (2024)
Learning is Forgetting: LLM Training As Lossy Compression
by: Conklin, Henry C., et al.
Published: (2026)
Disentangled Safety Adapters Enable Efficient Guardrails and Flexible Inference-Time Alignment
by: Krishna, Kundan, et al.
Published: (2025)
Advancing Event Forecasting through Massive Training of Large Language Models: Challenges, Solutions, and Broader Impacts
by: Lee, Sang-Woo, et al.
Published: (2025)
Sample-Efficient Online Learning in LM Agents via Hindsight Trajectory Rewriting
by: Hu, Michael Y., et al.
Published: (2025)
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
by: Shao, Zhihong, et al.
Published: (2024)
Faithfulness Evaluation for Decoder-only LLM Attributions with Controlled Retained Information
by: Huang, Xin, et al.
Published: (2026)
TPTT: Transforming Pretrained Transformers into Titans
by: Furfaro, Fabien
Published: (2025)
The broader spectrum of in-context learning
by: Lampinen, Andrew Kyle, et al.
Published: (2024)
Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases
by: Hu, Michael Y., et al.
Published: (2025)