| Main Authors: | Baveja, Gunbir Singh; Lewandowski, Alex; Schmidt, Mark |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.19698 |
Similar Items
Exploration and Adaptation in Non-Stationary Tasks with Diffusion Policies
by: Baveja, Gunbir Singh
Published: (2025)
Impact of Financial Literacy on Investment Decisions and Stock Market Participation using Extreme Learning Machines
by: Baveja, Gunbir Singh, et al.
Published: (2024)
The Need for a Big World Simulator: A Scientific Challenge for Continual Learning
by: Kumar, Saurabh, et al.
Published: (2024)
Directions of Curvature as an Explanation for Loss of Plasticity
by: Lewandowski, Alex, et al.
Published: (2023)
SageBwd: A Trainable Low-bit Attention
by: Zhang, Jintao, et al.
Published: (2026)
Geometric Insights into Focal Loss: Reducing Curvature for Enhanced Model Calibration
by: Kimura, Masanari, et al.
Published: (2024)
A Scalable Measure of Loss Landscape Curvature for Analyzing the Training Dynamics of LLMs
by: Kalra, Dayal Singh, et al.
Published: (2026)
When Bias Meets Trainability: Connecting Theories of Initialization
by: Bassi, Alberto, et al.
Published: (2025)
SACn: Soft Actor-Critic with n-step Returns
by: Łyskawa, Jakub, et al.
Published: (2025)
Natively Trainable Sparse Attention for Hierarchical Point Cloud Datasets
by: Lapautre, Nicolas, et al.
Published: (2025)
Playing the Lottery With Concave Regularizers for Sparse Trainable Neural Networks
by: Fracastoro, Giulia, et al.
Published: (2025)
Scalable Decision Focused Learning via Online Trainable Surrogates
by: Signorelli, Gaetano, et al.
Published: (2025)
On the Trainability of Masked Diffusion Language Models via Blockwise Locality
by: Wang, Yuxiang, et al.
Published: (2026)
Trainable Dynamic Mask Sparse Attention
by: Shi, Jingze, et al.
Published: (2025)
Diffmv: A Unified Diffusion Framework for Healthcare Predictions with Random Missing Views and View Laziness
by: Zhao, Chuang, et al.
Published: (2025)
There is a Singularity in the Loss Landscape
by: Lowell, Mark
Published: (2022)
A Unifying View of Coverage in Linear Off-Policy Evaluation
by: Amortila, Philip, et al.
Published: (2026)
Exploring Loss Design Techniques For Decision Tree Robustness To Label Noise
by: Sztukiewicz, Lukasz, et al.
Published: (2024)
Trainable and Explainable Simplicial Map Neural Networks
by: Paluzo-Hidalgo, Eduardo, et al.
Published: (2023)
Oversmoothing Alleviation in Graph Neural Networks: A Survey and Unified View
by: Jin, Yufei, et al.
Published: (2024)
APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation
by: Kumar, Ravin
Published: (2025)
Conjugate Learning Theory: Uncovering the Mechanisms of Trainability and Generalization in Deep Neural Networks
by: Qi, Binchuan
Published: (2026)
RewriteNets: End-to-End Trainable String-Rewriting for Generative Sequence Modeling
by: Vejendla, Harshil
Published: (2026)
Multi-Timescale Conductance Spiking Networks: A Sparse, Gradient-Trainable Framework with Rich Firing Dynamics for Enhanced Temporal Processing
by: Fulleda-Garcia, Alex, et al.
Published: (2026)
Projected Compression: Trainable Projection for Efficient Transformer Compression
by: Stefaniak, Maciej, et al.
Published: (2025)
Mapping the Edge of Chaos: Fractal-Like Boundaries in The Trainability of Decoder-Only Transformer Models
by: Torkamandi, Bahman
Published: (2025)
Rethinking Loss Reweighting for Imbalance Learning as an Inverse Problem: A Neural Collapse Point of View
by: Wang, Jinping, et al.
Published: (2026)
Weightless Neural Networks for Continuously Trainable Personalized Recommendation Systems
by: Latif, Rafayel, et al.
Published: (2025)
Multi-View Contrastive Learning for Robust Domain Adaptation in Medical Time Series Analysis
by: Oh, YongKyung, et al.
Published: (2025)
HATA: Trainable and Hardware-Efficient Hash-Aware Top-k Attention for Scalable Large Model Inference
by: Gong, Ping, et al.
Published: (2025)
CantorNet: A Sandbox for Testing Geometrical and Topological Complexity Measures
by: Lewandowski, Michal, et al.
Published: (2024)
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
by: Yuan, Jingyang, et al.
Published: (2025)
MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection
by: Lin, Bokai, et al.
Published: (2024)
Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections
by: Wang, Bo, et al.
Published: (2025)
From Growing to Looping: A Unified View of Iterative Computation in LLMs
by: Kapl, Ferdinand, et al.
Published: (2026)
A Unifying Framework for Learning Argumentation Semantics
by: Mileva, Zlatina, et al.
Published: (2023)
A Unified Definition of Hallucination: It's The World Model, Stupid!
by: Liu, Emmy, et al.
Published: (2025)
When Policies Cannot Be Retrained: A Unified Closed-Form View of Post-Training Steering in Offline Reinforcement Learning
by: Hossain, Elias, et al.
Published: (2026)
rSDNet: Unified Robust Neural Learning against Label Noise and Adversarial Attacks
by: Jana, Suryasis, et al.
Published: (2026)
Tackling the Noisy Elephant in the Room: Label Noise-robust Out-of-Distribution Detection via Loss Correction and Low-rank Decomposition
by: Azad, Tarhib Al, et al.
Published: (2025)