| Main Authors: | Yau, Morris, Gupta, Sharut, Engelmayer, Valerie, Irie, Kazuki, Jegelka, Stefanie, Andreas, Jacob |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.10918 |
Similar Items
Near-Optimal Algorithms for Group Distributionally Robust Optimization and Beyond
by: Soma, Tasuku, et al.
Published: (2022)
Canonicalizing Multimodal Contrastive Representation Learning
by: Gupta, Sharut, et al.
Published: (2026)
Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models
by: Gupta, Sharut, et al.
Published: (2025)
In-Context Symmetries: Self-Supervised Learning through Contextual World Models
by: Gupta, Sharut, et al.
Published: (2024)
Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
by: Irie, Kazuki, et al.
Published: (2025)
Parallel Algorithms Align with Neural Execution
by: Engelmayer, Valerie, et al.
Published: (2023)
Learning Diffusion Models with Flexible Representation Guidance
by: Wang, Chenyu, et al.
Published: (2025)
Understanding the Role of Equivariance in Self-supervised Learning
by: Wang, Yifei, et al.
Published: (2024)
Learning Linear Attention in Polynomial Time
by: Yau, Morris, et al.
Published: (2024)
An Information Criterion for Controlled Disentanglement of Multimodal Data
by: Wang, Chenyu, et al.
Published: (2024)
ReasonCACHE: Teaching LLMs To Reason Without Weight Updates
by: Gupta, Sharut, et al.
Published: (2026)
Are Graph Neural Networks Optimal Approximation Algorithms?
by: Yau, Morris, et al.
Published: (2023)
Why Are Positional Encodings Nonessential for Deep Autoregressive Transformers? Revisiting a Petroglyph
by: Irie, Kazuki
Published: (2024)
Survey on Generalization Theory for Graph Neural Networks
by: Vasileiou, Antonis, et al.
Published: (2025)
Higher-Order Graphon Neural Networks: Approximation and Cut Distance
by: Herbst, Daniel, et al.
Published: (2025)
Sample Complexity Bounds for Estimating Probability Divergences under Invariances
by: Tahmasebi, Behrooz, et al.
Published: (2023)
The Exact Sample Complexity Gain from Invariances for Kernel Regression
by: Tahmasebi, Behrooz, et al.
Published: (2023)
Learning to Approximate Uniform Facility Location via Graph Neural Networks
by: Qian, Chendi, et al.
Published: (2026)
Metalearning Continual Learning Algorithms
by: Irie, Kazuki, et al.
Published: (2023)
Exploring the Promise and Limits of Real-Time Recurrent Learning
by: Irie, Kazuki, et al.
Published: (2023)
Fast weight programming and linear transformers: from machine learning to neurobiology
by: Irie, Kazuki, et al.
Published: (2025)
Overcoming classic challenges for artificial neural networks by providing incentives and practice
by: Irie, Kazuki, et al.
Published: (2024)
A projection-based framework for gradient-free and parallel learning
by: Bergmeister, Andreas, et al.
Published: (2025)
Generalization, Expressivity, and Universality of Graph Neural Networks on Attributed Graphs
by: Rauchwerger, Levi, et al.
Published: (2024)
Neural Networks With Dense Weights Are Not Universal Approximators
by: Rauchwerger, Levi, et al.
Published: (2026)
Counting Substructures with Higher-Order Graph Neural Networks: Possibility and Impossibility Results
by: Tahmasebi, Behrooz, et al.
Published: (2020)
A Poincaré Inequality and Consistency Results for Signal Sampling on Large Graphs
by: Le, Thien, et al.
Published: (2023)
Self-Organising Neural Discrete Representation Learning à la Kohonen
by: Irie, Kazuki, et al.
Published: (2023)
When Less is Enough: Efficient Inference via Collaborative Reasoning
by: Chen, Yilei, et al.
Published: (2026)
Reinforce Adjoint Matching: Scaling RL Post-Training of Diffusion and Flow-Matching Models
by: Bergmeister, Andreas, et al.
Published: (2026)
Why Depth Matters in Parallelizable Sequence Models: A Lie Algebraic View
by: Heo, Gyuryang, et al.
Published: (2026)
Key-value memory in the brain
by: Gershman, Samuel J., et al.
Published: (2025)
PULSE: Practical Evaluation Scenarios for Large Multimodal Model Unlearning
by: Kawakami, Tatsuki, et al.
Published: (2025)
On the Emergence of Position Bias in Transformers
by: Wu, Xinyi, et al.
Published: (2025)
Learning on LoRAs: GL-Equivariant Processing of Low-Rank Weight Spaces for Large Finetuned Models
by: Putterman, Theo, et al.
Published: (2024)
LieAugmenter: Equivariant Learning by Discovering Symmetries with Learnable Augmentations
by: Santos-Escriche, Eduardo, et al.
Published: (2025)
Learning with Exact Invariances in Polynomial Time
by: Soleymani, Ashkan, et al.
Published: (2025)
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
by: Csordás, Róbert, et al.
Published: (2023)
Bifurcated Attention: Accelerating Massively Parallel Decoding with Shared Prefixes in LLMs
by: Athiwaratkun, Ben, et al.
Published: (2024)
Faster Diffusion Sampling with Randomized Midpoints: Sequential and Parallel
by: Gupta, Shivam, et al.
Published: (2024)