| Main Authors: | Chaudhry, Arslan; Thiagarajan, Sridhar; Lampinen, Andrew |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.01430 |
Similar Items
Finetuning Language Models to Emit Linguistic Expressions of Uncertainty
by: Chaudhry, Arslan, et al.
Published: (2024)
Latent learning: episodic memory complements parametric learning by enabling flexible reuse of experiences
by: Lampinen, Andrew Kyle, et al.
Published: (2025)
On the generalization of language models from in-context learning and finetuning: a controlled study
by: Lampinen, Andrew K., et al.
Published: (2025)
Distinct Computations Emerge From Compositional Curricula in In-Context Learning
by: Lee, Jin Hwa, et al.
Published: (2025)
Interpretability Illusions in the Generalization of Simplified Models
by: Friedman, Dan, et al.
Published: (2023)
Implicit Statistical Inference in Transformers: Approximating Likelihood-Ratio Tests In-Context
by: Chaudhry, Faris, et al.
Published: (2026)
ExpTest: Automating Learning Rate Searching and Tuning with Insights from Linearized Neural Networks
by: Chaudhry, Zan, et al.
Published: (2024)
Jensen-Shannon Divergence Based Novel Loss Functions for Bayesian Neural Networks
by: Thiagarajan, Ponkrshnan, et al.
Published: (2022)
Learned feature representations are biased by complexity, learning order, position, and more
by: Lampinen, Andrew Kyle, et al.
Published: (2024)
General Preference Reinforcement Learning
by: Umer, Muhammad, et al.
Published: (2026)
Universal Reinforcement Learning in Coalgebras: Asynchronous Stochastic Computation via Conduction
by: Mahadevan, Sridhar
Published: (2025)
A Comparative Analysis of LLM Adaptation: SFT, LoRA, and ICL in Data-Scarce Scenarios
by: Bohnet, Bernd, et al.
Published: (2025)
The Geometry of Projection Heads: Conditioning, Invariance, and Collapse
by: Chaudhry, Faris
Published: (2026)
Non-Interfering Weight Fields: Treating Model Parameters as a Continuously Extensible Function
by: Chaudhry, Sarim
Published: (2026)
Asymptotic and Finite-Time Guarantees for Langevin-Based Temperature Annealing in InfoNCE
by: Chaudhry, Faris
Published: (2026)
Linear representations in language models can change dramatically over a conversation
by: Lampinen, Andrew Kyle, et al.
Published: (2026)
Representation biases: will we achieve complete understanding by analyzing representations?
by: Lampinen, Andrew Kyle, et al.
Published: (2025)
The in-context inductive biases of vision-language models differ across modalities
by: Allen, Kelsey, et al.
Published: (2025)
Improving Routability Prediction via NAS Using a Smooth One-shot Augmented Predictor
by: Sridhar, Arjun, et al.
Published: (2024)
GAIA: Categorical Foundations of Generative AI
by: Mahadevan, Sridhar
Published: (2024)
ChronoPlastic Spiking Neural Networks
by: Chaudhry, Sarim
Published: (2025)
A Simple Sparse Matrix Vector Multiplication Approach to Padded Convolution
by: Chaudhry, Zan
Published: (2024)
The broader spectrum of in-context learning
by: Lampinen, Andrew Kyle, et al.
Published: (2024)
How do language models learn facts? Dynamics, curricula and hallucinations
by: Zucchet, Nicolas, et al.
Published: (2025)
Latent Regularization in Generative Test Input Generation
by: Merabishvili, Giorgi, et al.
Published: (2026)
Recursive Concept Evolution for Compositional Reasoning in Large Language Models
by: Chaudhry, Sarim
Published: (2026)
Scaling Laws and Pathologies of Single-Layer PINNs: Network Width and PDE Nonlinearity
by: Chaudhry, Faris
Published: (2026)
Data Distribution-based Curriculum Learning
by: Chaudhry, Shonal, et al.
Published: (2024)
The emergence of sparse attention: impact of data distribution and benefits of repetition
by: Zucchet, Nicolas, et al.
Published: (2025)
Context Sensitivity Improves Human-Machine Visual Alignment
by: Born, Frieda, et al.
Published: (2026)
Deep Learning-Based Visual Fatigue Detection Using Eye Gaze Patterns in VR
by: Zafar, Numan, et al.
Published: (2025)
GLAD: Improving Latent Graph Generative Modeling with Simple Quantization
by: Nguyen, Van Khoa, et al.
Published: (2024)
Universal Decision Learners
by: Mahadevan, Sridhar
Published: (2026)
Kan Extension Transformers: A Categorical Unification of Attention, Diffusion, and Predict-Detach Self-Conditioning
by: Mahadevan, Sridhar
Published: (2026)
PAGER: A Framework for Failure Analysis of Deep Regression Models
by: Thiagarajan, Jayaraman J., et al.
Published: (2023)
Bridging Quantum and Classical Computing in Drug Design: Architecture Principles for Improved Molecule Generation
by: Smith, Andrew, et al.
Published: (2025)
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
by: Geiping, Jonas, et al.
Published: (2025)
'Eyes of a Hawk and Ears of a Fox': Part Prototype Network for Generalized Zero-Shot Learning
by: Feinglass, Joshua, et al.
Published: (2024)
Enhancing Latent Computation in Transformers with Latent Tokens
by: Sun, Yuchang, et al.
Published: (2025)
PACEvolve++: Improving Test-time Learning for Evolutionary Search Agents
by: Yan, Minghao, et al.
Published: (2026)