Library Catalog

Bibliographic Details
Main Authors:	Das, Nataraj; Vedantam, Atreya; Lakshminarayanan, Chandrashekar
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.08302

Similar Items

Learning to Price: Interpretable Attribute-Level Models for Dynamic Markets
by: Sethuraman, Srividhya, et al.
Published: (2026)

Half-Space Feature Learning in Neural Networks
by: Yadav, Mahesh Lorik, et al.
Published: (2024)

ProdRev: A DNN framework for empowering customers using generative pre-trained transformers
by: Gupta, Aakash, et al.
Published: (2025)

Topological Signatures of Grokking
by: Tang, Yifan, et al.
Published: (2026)

RMLR: Extending Multinomial Logistic Regression into General Geometries
by: Chen, Ziheng, et al.
Published: (2024)

Grokking Explained: A Statistical Phenomenon
by: Carvalho, Breno W., et al.
Published: (2025)

Controlling Grokking with Nonlinearity and Data Symmetry
by: Salah, Ahmed, et al.
Published: (2024)

Grokfast: Accelerated Grokking by Amplifying Slow Gradients
by: Lee, Jaerin, et al.
Published: (2024)

Understanding Grokking Through A Robustness Viewpoint
by: Tan, Zhiquan, et al.
Published: (2023)

Progress Measures for Grokking on Real-world Tasks
by: Golechha, Satvik
Published: (2024)

Approximate Linear Programming for Decentralized Policy Iteration in Cooperative Multi-agent Markov Decision Processes
by: Mandal, Lakshmi, et al.
Published: (2023)

Grokking Finite-Dimensional Algebra
by: Notsawo, Pascal Jr Tikeng, et al.
Published: (2026)

Grokking Group Multiplication with Cosets
by: Stander, Dashiell, et al.
Published: (2023)

Muon Optimizer Accelerates Grokking
by: Tveit, Amund, et al.
Published: (2025)

Grokking Beyond the Euclidean Norm of Model Parameters
by: Notsawo, Pascal Jr Tikeng, et al.
Published: (2025)

Low-Dimensional and Transversely Curved Optimization Dynamics in Grokking
by: Xu, Yongzhong
Published: (2026)

Tracing the Path to Grokking: Embeddings, Dropout, and Network Activation
by: Salah, Ahmed, et al.
Published: (2025)

The Geometry of Grokking: Norm Minimization on the Zero-Loss Manifold
by: Musat, Tiberiu
Published: (2025)

NeuralGrok: Accelerate Grokking by Neural Gradient Transformation
by: Zhou, Xinyu, et al.
Published: (2025)

Early-Warning Signals of Grokking via Loss-Landscape Geometry
by: Xu, Yongzhong
Published: (2026)

Grokking as Structural Inference: Transformers Need Bayesian Lottery Tickets
by: Hidajat, Kai, et al.
Published: (2026)

Grokking and Generalization Collapse: Insights from HTSR theory
by: Prakash, Hari K., et al.
Published: (2025)

Sensitivity-Positional Co-Localization in GQA Transformers
by: Rao, Manoj Chandrashekar
Published: (2026)

Transitive RL: Value Learning via Divide and Conquer
by: Park, Seohong, et al.
Published: (2025)

Grokking as a Falsifiable Finite-Size Transition
by: Bi, Yuda, et al.
Published: (2026)

Acceleration of Grokking in Learning Arithmetic Operations via Kolmogorov-Arnold Representation
by: Park, Yeachan, et al.
Published: (2024)

Provable Scaling Laws of Feature Emergence from Learning Dynamics of Grokking
by: Tian, Yuandong
Published: (2025)

Critical Data Size of Language Models from a Grokking Perspective
by: Zhu, Xuekai, et al.
Published: (2024)

Grokking at the Edge of Numerical Stability
by: Prieto, Lucas, et al.
Published: (2025)

Hierarchical Online Prompt Mutation with Dual-Loop Feedback for Guardrailed Evidence Document Generation: A Production-Evaluation Case Study
by: Morabia, Nataraj Agaram Sundar Tejas
Published: (2026)

Two Speeds of Learning: A Representation-Readout Decomposition of Grokking and Double Descent
by: Chou, Chi-Ning, et al.
Published: (2026)

Feature Repulsion and Spectral Lock-in: An Empirical Study of Two-Layer Network Grokking
by: Xu, Yongzhong
Published: (2026)

Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
by: Lyu, Kaifeng, et al.
Published: (2023)

Towards Empirical Interpretation of Internal Circuits and Properties in Grokked Transformers on Modular Polynomials
by: Furuta, Hiroki, et al.
Published: (2024)

Auditable Unit-Aware Thresholds in Symbolic Regression via Logistic-Gated Operators
by: Deng, Ou, et al.
Published: (2025)

Transformers with Sparse Attention for Granger Causality
by: Mahesh, Riya, et al.
Published: (2024)

Optimizing Fintech Marketing: A Comparative Study of Logistic Regression and XGBoost
by: Attota, Sahar Yarmohammadtoosky Dinesh Chowdary
Published: (2024)

The Norm-Separation Delay Law of Grokking: A First-Principles Theory of Delayed Generalization
by: Khanh, Truong Xuan, et al.
Published: (2026)

The Geometry of Multi-Task Grokking: Transverse Instability, Superposition, and Weight Decay Phase Structure
by: Xu, Yongzhong
Published: (2026)

Grokking as a Variance-Limited Phase Transition: Spectral Gating and the Epsilon-Stability Threshold
by: Acharya, Pratyush, et al.
Published: (2026)