:: Library Catalog

Bibliographic Details
Main Authors:	Tang, Cheng; Lake, Brenden; Jazayeri, Mehrdad
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2502.15801

Similar Items

Rapid Word Learning Through Meta In-Context Learning
by: Wang, Wentao, et al.
Published: (2025)

Are they human? Detecting large language models by probing human memory constraints
by: Schug, Simon, et al.
Published: (2026)

Compositional learning of functions in humans and machines
by: Zhou, Yanli, et al.
Published: (2024)

CoLLEGe: Concept Embedding Generation for Large Language Models
by: Teehan, Ryan, et al.
Published: (2024)

Overcoming classic challenges for artificial neural networks by providing incentives and practice
by: Irie, Kazuki, et al.
Published: (2024)

Aligned at the Start: Conceptual Groupings in LLM Embeddings
by: Khatir, Mehrdad, et al.
Published: (2024)

Recursive Language Models Meet Uncertainty: The Surprising Effectiveness of Self-Reflective Program Search for Long Context
by: Alizadeh, Keivan, et al.
Published: (2026)

Detecting and explaining postpartum depression in real-time with generative artificial intelligence
by: García-Méndez, Silvia, et al.
Published: (2025)

A systematic investigation of learnability from single child linguistic input
by: Qin, Yulu, et al.
Published: (2024)

Scaling sparse feature circuit finding for in-context learning
by: Kharlapenko, Dmitrii, et al.
Published: (2025)

Out-of-distribution generalization via composition: a lens through induction heads in Transformers
by: Song, Jiajun, et al.
Published: (2024)

Illuminate: A novel approach for depression detection with explainable analysis and proactive therapy using prompt engineering
by: Agrawal, Aryan
Published: (2024)

The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
by: Shojaee, Parshin, et al.
Published: (2025)

TIDE: Every Layer Knows the Token Beneath the Context
by: Jaiswal, Ajay, et al.
Published: (2026)

Mechanistic origins of catastrophic forgetting: why RL preserves circuits better than SFT?
by: Nunez, Jeanmely Rojas, et al.
Published: (2026)

When can transformers reason with abstract symbols?
by: Boix-Adsera, Enric, et al.
Published: (2023)

Large Language Models in Cybersecurity: State-of-the-Art
by: Motlagh, Farzad Nourmohammadzadeh, et al.
Published: (2024)

Accelerated Portfolio Optimization and Option Pricing with Reinforcement Learning
by: Keramati, Hadi, et al.
Published: (2025)

SUS backprop: linear backpropagation algorithm for long inputs in transformers
by: Pankov, Sergey, et al.
Published: (2025)

Reasoning's Razor: Reasoning Improves Accuracy but Can Hurt Recall at Critical Operating Points in Safety and Hallucination Detection
by: Chegini, Atoosa, et al.
Published: (2025)

Beyond Next-Token Prediction: A Performance Characterization of Diffusion versus Autoregressive Language Models
by: Kim, Minseo, et al.
Published: (2025)

Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization
by: Samragh, Mohammad, et al.
Published: (2024)

MAVEN: Multi-Agent Verification-Elaboration Network with In-Step Epistemic Auditing
by: Yao, Yinsheng, et al.
Published: (2026)

LLM in a flash: Efficient Large Language Model Inference with Limited Memory
by: Alizadeh, Keivan, et al.
Published: (2023)

What explains the success of cross-modal fine-tuning with ORCA?
by: García-de-Herreros, Paloma, et al.
Published: (2024)

Tiny-Toxic-Detector: A compact transformer-based model for toxic content detection
by: Kamphuis, Michiel
Published: (2024)

ProdRev: A DNN framework for empowering customers using generative pre-trained transformers
by: Gupta, Aakash, et al.
Published: (2025)

Do different prompting methods yield a common task representation in language models?
by: Davidson, Guy, et al.
Published: (2025)

Crystal-KV: Efficient KV Cache Management for Chain-of-Thought LLMs via Answer-First Principle
by: Wang, Zihan, et al.
Published: (2026)

Zero-shot data citation function classification using transformer-based large language models (LLMs)
by: Byers, Neil, et al.
Published: (2025)

Do Large Language Models Reason Causally Like Us? Even Better?
by: Dettki, Hanna M., et al.
Published: (2025)

Arbitrage: Efficient Reasoning via Advantage-Aware Speculation
by: Maheswaran, Monishwaran, et al.
Published: (2025)

Neural networks for abstraction and reasoning: Towards broad generalization in machines
by: Bober-Irizar, Mikel, et al.
Published: (2024)

On the generalization of language models from in-context learning and finetuning: a controlled study
by: Lampinen, Andrew K., et al.
Published: (2025)

Cartridges: Lightweight and general-purpose long context representations via self-study
by: Eyuboglu, Sabri, et al.
Published: (2025)

Detecting mental disorder on social media: a ChatGPT-augmented explainable approach
by: Belcastro, Loris, et al.
Published: (2024)

Slow-Fast Policy Optimization: Reposition-Before-Update for LLM Reasoning
by: Wang, Ziyan, et al.
Published: (2025)

Gradual Binary Search and Dimension Expansion : A general method for activation quantization in LLMs
by: Maisonnave, Lucas, et al.
Published: (2025)

MegaMath: Pushing the Limits of Open Math Corpora
by: Zhou, Fan, et al.
Published: (2025)

Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts
by: Pang, Jing-Cheng, et al.
Published: (2024)