Bibliographic Details
Main Authors:	Baveja, Gunbir Singh; Lewandowski, Alex; Schmidt, Mark
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2509.19698

Similar Items

Exploration and Adaptation in Non-Stationary Tasks with Diffusion Policies
by: Baveja, Gunbir Singh
Published: (2025)

Impact of Financial Literacy on Investment Decisions and Stock Market Participation using Extreme Learning Machines
by: Baveja, Gunbir Singh, et al.
Published: (2024)

The Need for a Big World Simulator: A Scientific Challenge for Continual Learning
by: Kumar, Saurabh, et al.
Published: (2024)

Directions of Curvature as an Explanation for Loss of Plasticity
by: Lewandowski, Alex, et al.
Published: (2023)

SageBwd: A Trainable Low-bit Attention
by: Zhang, Jintao, et al.
Published: (2026)

Geometric Insights into Focal Loss: Reducing Curvature for Enhanced Model Calibration
by: Kimura, Masanari, et al.
Published: (2024)

A Scalable Measure of Loss Landscape Curvature for Analyzing the Training Dynamics of LLMs
by: Kalra, Dayal Singh, et al.
Published: (2026)

When Bias Meets Trainability: Connecting Theories of Initialization
by: Bassi, Alberto, et al.
Published: (2025)

SACn: Soft Actor-Critic with n-step Returns
by: Łyskawa, Jakub, et al.
Published: (2025)

Natively Trainable Sparse Attention for Hierarchical Point Cloud Datasets
by: Lapautre, Nicolas, et al.
Published: (2025)

Playing the Lottery With Concave Regularizers for Sparse Trainable Neural Networks
by: Fracastoro, Giulia, et al.
Published: (2025)

Scalable Decision Focused Learning via Online Trainable Surrogates
by: Signorelli, Gaetano, et al.
Published: (2025)

On the Trainability of Masked Diffusion Language Models via Blockwise Locality
by: Wang, Yuxiang, et al.
Published: (2026)

Trainable Dynamic Mask Sparse Attention
by: Shi, Jingze, et al.
Published: (2025)

Diffmv: A Unified Diffusion Framework for Healthcare Predictions with Random Missing Views and View Laziness
by: Zhao, Chuang, et al.
Published: (2025)

There is a Singularity in the Loss Landscape
by: Lowell, Mark
Published: (2022)

A Unifying View of Coverage in Linear Off-Policy Evaluation
by: Amortila, Philip, et al.
Published: (2026)

Exploring Loss Design Techniques For Decision Tree Robustness To Label Noise
by: Sztukiewicz, Lukasz, et al.
Published: (2024)

Trainable and Explainable Simplicial Map Neural Networks
by: Paluzo-Hidalgo, Eduardo, et al.
Published: (2023)

Oversmoothing Alleviation in Graph Neural Networks: A Survey and Unified View
by: Jin, Yufei, et al.
Published: (2024)

APTx Neuron: A Unified Trainable Neuron Architecture Integrating Activation and Computation
by: Kumar, Ravin
Published: (2025)

Conjugate Learning Theory: Uncovering the Mechanisms of Trainability and Generalization in Deep Neural Networks
by: Qi, Binchuan
Published: (2026)

RewriteNets: End-to-End Trainable String-Rewriting for Generative Sequence Modeling
by: Vejendla, Harshil
Published: (2026)

Multi-Timescale Conductance Spiking Networks: A Sparse, Gradient-Trainable Framework with Rich Firing Dynamics for Enhanced Temporal Processing
by: Fulleda-Garcia, Alex, et al.
Published: (2026)

Projected Compression: Trainable Projection for Efficient Transformer Compression
by: Stefaniak, Maciej, et al.
Published: (2025)

Mapping the Edge of Chaos: Fractal-Like Boundaries in The Trainability of Decoder-Only Transformer Models
by: Torkamandi, Bahman
Published: (2025)

Rethinking Loss Reweighting for Imbalance Learning as an Inverse Problem: A Neural Collapse Point of View
by: Wang, Jinping, et al.
Published: (2026)

Weightless Neural Networks for Continuously Trainable Personalized Recommendation Systems
by: Latif, Rafayel, et al.
Published: (2025)

Multi-View Contrastive Learning for Robust Domain Adaptation in Medical Time Series Analysis
by: Oh, YongKyung, et al.
Published: (2025)

HATA: Trainable and Hardware-Efficient Hash-Aware Top-k Attention for Scalable Large Model Inference
by: Gong, Ping, et al.
Published: (2025)

CantorNet: A Sandbox for Testing Geometrical and Topological Complexity Measures
by: Lewandowski, Michal, et al.
Published: (2024)

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
by: Yuan, Jingyang, et al.
Published: (2025)

MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection
by: Lin, Bokai, et al.
Published: (2024)

Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections
by: Wang, Bo, et al.
Published: (2025)

From Growing to Looping: A Unified View of Iterative Computation in LLMs
by: Kapl, Ferdinand, et al.
Published: (2026)

A Unifying Framework for Learning Argumentation Semantics
by: Mileva, Zlatina, et al.
Published: (2023)

A Unified Definition of Hallucination: It's The World Model, Stupid!
by: Liu, Emmy, et al.
Published: (2025)

When Policies Cannot Be Retrained: A Unified Closed-Form View of Post-Training Steering in Offline Reinforcement Learning
by: Hossain, Elias, et al.
Published: (2026)

rSDNet: Unified Robust Neural Learning against Label Noise and Adversarial Attacks
by: Jana, Suryasis, et al.
Published: (2026)

Tackling the Noisy Elephant in the Room: Label Noise-robust Out-of-Distribution Detection via Loss Correction and Low-rank Decomposition
by: Azad, Tarhib Al, et al.
Published: (2025)