| Main Authors: | Demir, Samet; Dogan, Zafer |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.15152 |
Similar Items
How Data Mixing Shapes In-Context Learning: Asymptotic Equivalence for Transformers with MLPs
by: Demir, Samet, et al.
Published: (2025)
Asymptotic Analysis of Two-Layer Neural Networks after One Gradient Step under Gaussian Mixtures Data with Structure
by: Demir, Samet, et al.
Published: (2025)
Input-Label Correlation Governs a Linear-to-Nonlinear Transition in Random Features under Spiked Covariance
by: Demir, Samet, et al.
Published: (2024)
Optimal Attention Temperature Improves the Robustness of In-Context Learning under Distribution Shift in High Dimensions
by: Demir, Samet, et al.
Published: (2025)
Implicitly Normalized Online PCA: A Regularized Algorithm with Exact High-Dimensional Dynamics
by: Demir, Samet, et al.
Published: (2025)
Learning Beyond the Gaussian Data: Learning Dynamics of Neural Networks on an Expressive and Cumulant-Controllable Data Model
by: Ure, Onat, et al.
Published: (2026)
Benefits of Online Tilted Empirical Risk Minimization: A Case Study of Outlier Detection and Robust Regression
by: Yildirim, Yigit E., et al.
Published: (2025)
Learning Rate Should Scale Inversely with High-Order Data Moments in High-Dimensional Online Independent Component Analysis
by: Gultekin, M. Oguzhan, et al.
Published: (2025)
Learnability and Competition in High-Dimensional Multi-Component ICA
by: Genc, Eser Ilke, et al.
Published: (2026)
Exploring the Precise Dynamics of Single-Layer GAN Models: Leveraging Multi-Feature Discriminators for High-Dimensional Subspace Learning
by: Bond, Andrew, et al.
Published: (2024)
Test-Time Training Provably Improves Transformers as In-context Learners
by: Gozeten, Halil Alperen, et al.
Published: (2025)
Gating is Weighting: Understanding Gated Linear Attention through In-context Learning
by: Li, Yingcong, et al.
Published: (2025)
Latent Chain-of-Thought Improves Structured-Data Transformers
by: Dudley, Carson, et al.
Published: (2026)
Negotiated Representations to Prevent Overfitting in Machine Learning Applications
by: Korhan, Nuri, et al.
Published: (2023)
Fine-grained Analysis of In-context Linear Estimation: Data, Architecture, and Beyond
by: Li, Yingcong, et al.
Published: (2024)
Can Transformers Learn Optimal Filtering for Unknown Systems?
by: Balim, Haldun, et al.
Published: (2023)
Selective Attention: Enhancing Transformer through Principled Context Control
by: Zhang, Xuechen, et al.
Published: (2024)
Towards Generalized Hydrological Forecasting using Transformer Models for 120-Hour Streamflow Prediction
by: Demiray, Bekir Z., et al.
Published: (2024)
Covariance-Aware Transformers for Quadratic Programming and Decision Making
by: Tire, Kutay, et al.
Published: (2026)
Asymptotic Learning Curves for Diffusion Models with Random Features Score and Manifold Data
by: George, Anand Jerry, et al.
Published: (2026)
All Random Features Representations are Equivalent
by: Sernau, Luke, et al.
Published: (2024)
Gaussian Equivalence for Self-Attention: Asymptotic Spectral Analysis of Attention Matrix
by: Hayase, Tomohiro, et al.
Published: (2025)
On the Power of Convolution Augmented Transformer
by: Li, Mingchen, et al.
Published: (2024)
Asymptotic theory of in-context learning by linear attention
by: Lu, Yue M., et al.
Published: (2024)
Learning Randomized Algorithms with Transformers
by: von Oswald, Johannes, et al.
Published: (2024)
Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks
by: Park, Jongho, et al.
Published: (2024)
Learning to Bet for Horizon-Aware Anytime-Valid Testing
by: Taga, Ege Onur, et al.
Published: (2026)
Learning with Subset Stacking
by: Birbil, Ş. İlker, et al.
Published: (2021)
Circuit Transformer: A Transformer That Preserves Logical Equivalence
by: Li, Xihan, et al.
Published: (2024)
Linking In-context Learning in Transformers to Human Episodic Memory
by: Ji-An, Li, et al.
Published: (2024)
From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers
by: Ildiz, M. Emrullah, et al.
Published: (2024)
Breaking through the learning plateaus of in-context learning in Transformer
by: Fu, Jingwen, et al.
Published: (2023)
Equivalence of Context and Parameter Updates in Modern Transformer Blocks
by: Goldwaser, Adrian, et al.
Published: (2025)
SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
by: Shams, Siavash, et al.
Published: (2024)
Cross-Embodied Affordance Transfer through Learning Affordance Equivalences
by: Aktas, Hakan, et al.
Published: (2024)
Asymptotics of Learning with Deep Structured (Random) Features
by: Schröder, Dominik, et al.
Published: (2024)
Towards Understanding How Transformers Learn In-context Through a Representation Learning Lens
by: Ren, Ruifeng, et al.
Published: (2023)
Transformers as Support Vector Machines
by: Tarzanagh, Davoud Ataee, et al.
Published: (2023)
Transformers are Universal In-context Learners
by: Furuya, Takashi, et al.
Published: (2024)
In-Context Learning Under Regime Change
by: Dudley, Carson, et al.
Published: (2026)