| Main Authors: | Das, Nataraj; Vedantam, Atreya; Lakshminarayanan, Chandrashekar |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.08302 |
Similar Items
Learning to Price: Interpretable Attribute-Level Models for Dynamic Markets
by: Sethuraman, Srividhya, et al.
Published: (2026)
Half-Space Feature Learning in Neural Networks
by: Yadav, Mahesh Lorik, et al.
Published: (2024)
ProdRev: A DNN framework for empowering customers using generative pre-trained transformers
by: Gupta, Aakash, et al.
Published: (2025)
Topological Signatures of Grokking
by: Tang, Yifan, et al.
Published: (2026)
RMLR: Extending Multinomial Logistic Regression into General Geometries
by: Chen, Ziheng, et al.
Published: (2024)
Grokking Explained: A Statistical Phenomenon
by: Carvalho, Breno W., et al.
Published: (2025)
Controlling Grokking with Nonlinearity and Data Symmetry
by: Salah, Ahmed, et al.
Published: (2024)
Grokfast: Accelerated Grokking by Amplifying Slow Gradients
by: Lee, Jaerin, et al.
Published: (2024)
Understanding Grokking Through A Robustness Viewpoint
by: Tan, Zhiquan, et al.
Published: (2023)
Progress Measures for Grokking on Real-world Tasks
by: Golechha, Satvik
Published: (2024)
Approximate Linear Programming for Decentralized Policy Iteration in Cooperative Multi-agent Markov Decision Processes
by: Mandal, Lakshmi, et al.
Published: (2023)
Grokking Finite-Dimensional Algebra
by: Notsawo, Pascal Jr Tikeng, et al.
Published: (2026)
Grokking Group Multiplication with Cosets
by: Stander, Dashiell, et al.
Published: (2023)
Muon Optimizer Accelerates Grokking
by: Tveit, Amund, et al.
Published: (2025)
Grokking Beyond the Euclidean Norm of Model Parameters
by: Notsawo, Pascal Jr Tikeng, et al.
Published: (2025)
Low-Dimensional and Transversely Curved Optimization Dynamics in Grokking
by: Xu, Yongzhong
Published: (2026)
Tracing the Path to Grokking: Embeddings, Dropout, and Network Activation
by: Salah, Ahmed, et al.
Published: (2025)
The Geometry of Grokking: Norm Minimization on the Zero-Loss Manifold
by: Musat, Tiberiu
Published: (2025)
NeuralGrok: Accelerate Grokking by Neural Gradient Transformation
by: Zhou, Xinyu, et al.
Published: (2025)
Early-Warning Signals of Grokking via Loss-Landscape Geometry
by: Xu, Yongzhong
Published: (2026)
Grokking as Structural Inference: Transformers Need Bayesian Lottery Tickets
by: Hidajat, Kai, et al.
Published: (2026)
Grokking and Generalization Collapse: Insights from HTSR theory
by: Prakash, Hari K., et al.
Published: (2025)
Sensitivity-Positional Co-Localization in GQA Transformers
by: Rao, Manoj Chandrashekar
Published: (2026)
Transitive RL: Value Learning via Divide and Conquer
by: Park, Seohong, et al.
Published: (2025)
Grokking as a Falsifiable Finite-Size Transition
by: Bi, Yuda, et al.
Published: (2026)
Acceleration of Grokking in Learning Arithmetic Operations via Kolmogorov-Arnold Representation
by: Park, Yeachan, et al.
Published: (2024)
Provable Scaling Laws of Feature Emergence from Learning Dynamics of Grokking
by: Tian, Yuandong
Published: (2025)
Critical Data Size of Language Models from a Grokking Perspective
by: Zhu, Xuekai, et al.
Published: (2024)
Grokking at the Edge of Numerical Stability
by: Prieto, Lucas, et al.
Published: (2025)
Hierarchical Online Prompt Mutation with Dual-Loop Feedback for Guardrailed Evidence Document Generation: A Production-Evaluation Case Study
by: Morabia, Nataraj Agaram Sundar Tejas
Published: (2026)
Two Speeds of Learning: A Representation-Readout Decomposition of Grokking and Double Descent
by: Chou, Chi-Ning, et al.
Published: (2026)
Feature Repulsion and Spectral Lock-in: An Empirical Study of Two-Layer Network Grokking
by: Xu, Yongzhong
Published: (2026)
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
by: Lyu, Kaifeng, et al.
Published: (2023)
Towards Empirical Interpretation of Internal Circuits and Properties in Grokked Transformers on Modular Polynomials
by: Furuta, Hiroki, et al.
Published: (2024)
Auditable Unit-Aware Thresholds in Symbolic Regression via Logistic-Gated Operators
by: Deng, Ou, et al.
Published: (2025)
Transformers with Sparse Attention for Granger Causality
by: Mahesh, Riya, et al.
Published: (2024)
Optimizing Fintech Marketing: A Comparative Study of Logistic Regression and XGBoost
by: Attota, Sahar Yarmohammadtoosky Dinesh Chowdary
Published: (2024)
The Norm-Separation Delay Law of Grokking: A First-Principles Theory of Delayed Generalization
by: Khanh, Truong Xuan, et al.
Published: (2026)
The Geometry of Multi-Task Grokking: Transverse Instability, Superposition, and Weight Decay Phase Structure
by: Xu, Yongzhong
Published: (2026)
Grokking as a Variance-Limited Phase Transition: Spectral Gating and the Epsilon-Stability Threshold
by: Acharya, Pratyush, et al.
Published: (2026)