| Main Authors: | Lin, Ziqian, Bharti, Shubham Kumar, Lee, Kangwook |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.19787 |
Similar Items
Dual Operating Modes of In-Context Learning
by: Lin, Ziqian, et al.
Published: (2024)
Task Vectors in In-Context Learning: Emergence, Formation, and Benefit
by: Yang, Liu, et al.
Published: (2025)
Fine-Tuning Without Forgetting In-Context Learning: A Theoretical Analysis of Linear Attention Models
by: Lee, Chungpa, et al.
Published: (2026)
Can MLLMs Perform Text-to-Image In-Context Learning?
by: Zeng, Yuchen, et al.
Published: (2024)
Transformers in the Dark: Navigating Unknown Search Spaces via Bandit Feedback
by: Kim, Jungtaek, et al.
Published: (2026)
Optimizing DDPM Sampling with Shortcut Fine-Tuning
by: Fan, Ying, et al.
Published: (2023)
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks
by: Park, Jongho, et al.
Published: (2024)
The Expressive Power of Low-Rank Adaptation
by: Zeng, Yuchen, et al.
Published: (2023)
Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition
by: Xiong, Zheyang, et al.
Published: (2024)
Looped Transformers are Better at Learning Learning Algorithms
by: Yang, Liu, et al.
Published: (2023)
Variation Spaces for Multi-Output Neural Networks: Insights on Multi-Task Learning and Network Compression
by: Shenouda, Joseph, et al.
Published: (2023)
Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance
by: Park, Dongmin, et al.
Published: (2024)
Looped Transformers for Length Generalization
by: Fan, Ying, et al.
Published: (2024)
Memorization Capacity for Additive Fine-Tuning with Small ReLU Networks
by: Sohn, Jy-yong, et al.
Published: (2024)
A Novel Data-Dependent Learning Paradigm for Large Hypothesis Classes
by: Pour, Alireza F., et al.
Published: (2025)
DS-AL: A Dual-Stream Analytic Learning for Exemplar-Free Class-Incremental Learning
by: Zhuang, Huiping, et al.
Published: (2024)
ReJump: A Tree-Jump Representation for Analyzing and Improving LLM Reasoning
by: Zeng, Yuchen, et al.
Published: (2025)
Leave-One-Out Prediction for General Hypothesis Classes
by: Qian, Jian, et al.
Published: (2026)
Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges
by: Lee, Nayoung, et al.
Published: (2025)
Predictive Pipelined Decoding: A Compute-Latency Trade-off for Exact LLM Decoding
by: Yang, Seongjun, et al.
Published: (2023)
Smoothness Adaptive Hypothesis Transfer Learning
by: Lin, Haotian, et al.
Published: (2024)
On Hypothesis Transfer Learning of Functional Linear Models
by: Lin, Haotian, et al.
Published: (2022)
Infected Smallville: How Disease Threat Shapes Sociality in LLM Agents
by: Choi, Soyeon, et al.
Published: (2025)
From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data
by: Xiong, Zheyang, et al.
Published: (2024)
What Do Language Models Learn in Context? The Structured Task Hypothesis
by: Li, Jiaoda, et al.
Published: (2024)
ENTP: Encoder-only Next Token Prediction
by: Ewer, Ethan, et al.
Published: (2024)
How to Correctly Report LLM-as-a-Judge Evaluations
by: Lee, Chungpa, et al.
Published: (2025)
Parameter-Efficient Fine-Tuning of State Space Models
by: Galim, Kevin, et al.
Published: (2024)
Multi-Bin Batching for Increasing LLM Inference Throughput
by: Guldogan, Ozgur, et al.
Published: (2024)
Automated Type Annotation in Python Using Large Language Models
by: Bharti, Varun, et al.
Published: (2025)
Hypothesis Class Determines Explanation: Why Accurate Models Disagree on Feature Attribution
by: B, Thackshanaramana
Published: (2026)
Bayesian Active Learning in the Presence of Nuisance Parameters
by: Sloman, Sabina J., et al.
Published: (2023)
Muon with Spectral Guidance: Efficient Optimization for Scientific Machine Learning
by: Lu, Binghang, et al.
Published: (2026)
HTM-EAR: Importance-Preserving Tiered Memory with Hybrid Routing under Saturation
by: Singh, Shubham Kumar
Published: (2026)
Rethinking Attention Output Projection: Structured Hadamard Transforms for Efficient Transformers
by: Aggarwal, Shubham, et al.
Published: (2026)
MixKVQ: Query-Aware Mixed-Precision KV Cache Quantization for Long-Context Reasoning
by: Zhang, Tao, et al.
Published: (2025)
Quantization vs Pruning: Insights from the Strong Lottery Ticket Hypothesis
by: Kumar, Aakash, et al.
Published: (2025)
Hypothesis Spaces for Deep Learning
by: Wang, Rui, et al.
Published: (2024)
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
by: Kim, Sungnyun, et al.
Published: (2025)
LookaheadKV: Fast and Accurate KV Cache Eviction by Glimpsing into the Future without Generation
by: Ahn, Jinwoo, et al.
Published: (2026)