| Main Authors: | Ronen, Omer, Humayun, Ahmed Imtiaz, Baraniuk, Richard, Balestriero, Randall, Yu, Bin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2406.09657 |
Similar Items
Deep Networks Always Grok and Here is Why
by: Humayun, Ahmed Imtiaz, et al.
Published: (2024)
On the Geometry of Deep Learning
by: Balestriero, Randall, et al.
Published: (2024)
The Linear Centroids Hypothesis: Features as Directions Learned by Local Experts
by: Walker, Thomas, et al.
Published: (2026)
GrokAlign: Geometric Characterisation and Acceleration of Grokking
by: Walker, Thomas, et al.
Published: (2025)
SplineCam: Exact Visualization and Characterization of Deep Network Geometry and Decision Boundaries
by: Humayun, Ahmed Imtiaz, et al.
Published: (2023)
The Geometric Structure of Models Learning Sparse Data
by: Walker, Thomas, et al.
Published: (2026)
Self-Improving Diffusion Models with Synthetic Data
by: Alemohammad, Sina, et al.
Published: (2024)
ALLoRA: Adaptive Learning Rate Mitigates LoRA Fatal Flaws
by: Huang, Hai, et al.
Published: (2024)
SAFE: A Novel Approach to AI Weather Evaluation through Stratified Assessments of Forecasts over Earth
by: Masi, Nick, et al.
Published: (2025)
Eidetic Learning: an Efficient and Provable Solution to Catastrophic Forgetting
by: Dronen, Nicholas, et al.
Published: (2025)
Task Priors: Enhancing Model Evaluation by Considering the Entire Space of Downstream Tasks
by: Patel, Niket, et al.
Published: (2025)
No Location Left Behind: Measuring and Improving the Fairness of Implicit Representations for Earth Data
by: Cai, Daniel, et al.
Published: (2025)
Fast and Exact Enumeration of Deep Networks Partitions Regions
by: Balestriero, Randall, et al.
Published: (2024)
Max-Affine Spline Insights Into Deep Network Pruning
by: You, Haoran, et al.
Published: (2021)
Curvature Tuning: Provable Training-free Model Steering From a Single Parameter
by: Hu, Leyang, et al.
Published: (2025)
Post-Hoc Guidance for Consistency Models by Joint Flow Distribution Learning
by: Hsu, Chia-Hong, et al.
Published: (2026)
Occam's Razor for Self Supervised Learning: What is Sufficient to Learn Good Representations?
by: Ibrahim, Mark, et al.
Published: (2024)
Semantic Tube Prediction: Beating LLM Data Efficiency with JEPA
by: Huang, Hai, et al.
Published: (2026)
Variance Covariance Regularization Enforces Pairwise Independence in Self-Supervised Representations
by: Mialon, Grégoire, et al.
Published: (2022)
Learning by Reconstruction Produces Uninformative Features For Perception
by: Balestriero, Randall, et al.
Published: (2024)
LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics
by: Balestriero, Randall, et al.
Published: (2025)
The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws
by: Jin, Tian, et al.
Published: (2025)
The Fair Language Model Paradox
by: Pinto, Andrea, et al.
Published: (2024)
Characterizing Large Language Model Geometry Helps Solve Toxicity Detection and Generation
by: Balestriero, Randall, et al.
Published: (2023)
CBOL-Tuner: Classifier-pruned Bayesian optimization to explore temporally structured latent spaces for particle accelerator tuning
by: Rautela, Mahindra, et al.
Published: (2024)
Self-Supervised Anomaly Detection in the Wild: Favor Joint Embeddings Methods
by: Otero, Daniel, et al.
Published: (2024)
Position: An Empirically Grounded Identifiability Theory Will Accelerate Self-Supervised Learning Research
by: Reizinger, Patrik, et al.
Published: (2025)
Your Attention Matters: to Improve Model Robustness to Noise and Spurious Correlations
by: Tamayo-Rousseau, Camilo, et al.
Published: (2025)
Is your algorithm unlearning or untraining?
by: Triantafillou, Eleni, et al.
Published: (2026)
Improving Fairness and Mitigating MADness in Generative Models
by: Mayer, Paul, et al.
Published: (2024)
PrAg-PO: Prompt Augmented Policy Optimization for Robust and Diverse Mathematical Reasoning
by: Lu, Wenquan, et al.
Published: (2026)
On the Computational Efficiency of Bayesian Additive Regression Trees: An Asymptotic Analysis
by: Tan, Yan Shuo, et al.
Published: (2024)
Approaching an unknown communication system by latent space exploration and causal inference
by: Beguš, Gašper, et al.
Published: (2023)
Double Descent and Other Interpolation Phenomena in GANs
by: Luzi, Lorenzo, et al.
Published: (2021)
PCS-UQ: Uncertainty Quantification via the Predictability-Computability-Stability Framework
by: Agarwal, Abhineet, et al.
Published: (2025)
stable-pretraining-v1: Foundation Model Research Made Simple
by: Balestriero, Randall, et al.
Published: (2025)
Synthetic Context Generation for Question Generation
by: Liu, Naiming, et al.
Published: (2024)
Circuit Complexity of Hierarchical Knowledge Tracing and Implications for Log-Precision Transformers
by: Liu, Naiming, et al.
Published: (2026)
Stabilizing autoregressive forecasts in chaotic systems via multi-rate latent recurrence
by: Dhingra, Mrigank, et al.
Published: (2026)
GPS-SSL: Guided Positive Sampling to Inject Prior Into Self-Supervised Learning
by: Feizi, Aarash, et al.
Published: (2024)