| Main Authors: | Tang, Cheng, Lake, Brenden, Jazayeri, Mehrdad |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.15801 |
Similar Items
Rapid Word Learning Through Meta In-Context Learning
by: Wang, Wentao, et al.
Published: (2025)
Are they human? Detecting large language models by probing human memory constraints
by: Schug, Simon, et al.
Published: (2026)
Compositional learning of functions in humans and machines
by: Zhou, Yanli, et al.
Published: (2024)
CoLLEGe: Concept Embedding Generation for Large Language Models
by: Teehan, Ryan, et al.
Published: (2024)
Overcoming classic challenges for artificial neural networks by providing incentives and practice
by: Irie, Kazuki, et al.
Published: (2024)
Aligned at the Start: Conceptual Groupings in LLM Embeddings
by: Khatir, Mehrdad, et al.
Published: (2024)
Recursive Language Models Meet Uncertainty: The Surprising Effectiveness of Self-Reflective Program Search for Long Context
by: Alizadeh, Keivan, et al.
Published: (2026)
Detecting and explaining postpartum depression in real-time with generative artificial intelligence
by: García-Méndez, Silvia, et al.
Published: (2025)
A systematic investigation of learnability from single child linguistic input
by: Qin, Yulu, et al.
Published: (2024)
Scaling sparse feature circuit finding for in-context learning
by: Kharlapenko, Dmitrii, et al.
Published: (2025)
Out-of-distribution generalization via composition: a lens through induction heads in Transformers
by: Song, Jiajun, et al.
Published: (2024)
Illuminate: A novel approach for depression detection with explainable analysis and proactive therapy using prompt engineering
by: Agrawal, Aryan
Published: (2024)
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
by: Shojaee, Parshin, et al.
Published: (2025)
TIDE: Every Layer Knows the Token Beneath the Context
by: Jaiswal, Ajay, et al.
Published: (2026)
Mechanistic origins of catastrophic forgetting: why RL preserves circuits better than SFT?
by: Nunez, Jeanmely Rojas, et al.
Published: (2026)
When can transformers reason with abstract symbols?
by: Boix-Adsera, Enric, et al.
Published: (2023)
Large Language Models in Cybersecurity: State-of-the-Art
by: Motlagh, Farzad Nourmohammadzadeh, et al.
Published: (2024)
Accelerated Portfolio Optimization and Option Pricing with Reinforcement Learning
by: Keramati, Hadi, et al.
Published: (2025)
SUS backprop: linear backpropagation algorithm for long inputs in transformers
by: Pankov, Sergey, et al.
Published: (2025)
Reasoning's Razor: Reasoning Improves Accuracy but Can Hurt Recall at Critical Operating Points in Safety and Hallucination Detection
by: Chegini, Atoosa, et al.
Published: (2025)
Beyond Next-Token Prediction: A Performance Characterization of Diffusion versus Autoregressive Language Models
by: Kim, Minseo, et al.
Published: (2025)
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization
by: Samragh, Mohammad, et al.
Published: (2024)
MAVEN: Multi-Agent Verification-Elaboration Network with In-Step Epistemic Auditing
by: Yao, Yinsheng, et al.
Published: (2026)
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
by: Alizadeh, Keivan, et al.
Published: (2023)
What explains the success of cross-modal fine-tuning with ORCA?
by: García-de-Herreros, Paloma, et al.
Published: (2024)
Tiny-Toxic-Detector: A compact transformer-based model for toxic content detection
by: Kamphuis, Michiel
Published: (2024)
ProdRev: A DNN framework for empowering customers using generative pre-trained transformers
by: Gupta, Aakash, et al.
Published: (2025)
Do different prompting methods yield a common task representation in language models?
by: Davidson, Guy, et al.
Published: (2025)
Crystal-KV: Efficient KV Cache Management for Chain-of-Thought LLMs via Answer-First Principle
by: Wang, Zihan, et al.
Published: (2026)
Zero-shot data citation function classification using transformer-based large language models (LLMs)
by: Byers, Neil, et al.
Published: (2025)
Do Large Language Models Reason Causally Like Us? Even Better?
by: Dettki, Hanna M., et al.
Published: (2025)
Arbitrage: Efficient Reasoning via Advantage-Aware Speculation
by: Maheswaran, Monishwaran, et al.
Published: (2025)
Neural networks for abstraction and reasoning: Towards broad generalization in machines
by: Bober-Irizar, Mikel, et al.
Published: (2024)
On the generalization of language models from in-context learning and finetuning: a controlled study
by: Lampinen, Andrew K., et al.
Published: (2025)
Cartridges: Lightweight and general-purpose long context representations via self-study
by: Eyuboglu, Sabri, et al.
Published: (2025)
Detecting mental disorder on social media: a ChatGPT-augmented explainable approach
by: Belcastro, Loris, et al.
Published: (2024)
Slow-Fast Policy Optimization: Reposition-Before-Update for LLM Reasoning
by: Wang, Ziyan, et al.
Published: (2025)
Gradual Binary Search and Dimension Expansion: A general method for activation quantization in LLMs
by: Maisonnave, Lucas, et al.
Published: (2025)
MegaMath: Pushing the Limits of Open Math Corpora
by: Zhou, Fan, et al.
Published: (2025)
Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts
by: Pang, Jing-Cheng, et al.
Published: (2024)