:: Library Catalog

Bibliographic Details
Main Authors:	Duvvuri, Sai Surya; Patel, Nirmal; Gupta, Nilesh; Dhillon, Inderjit S.
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.10410

Similar Items

LASER: Attention with Exponential Transformation
by: Duvvuri, Sai Surya, et al.
Published: (2024)

ODRPO: Ordinal Decompositions of Discrete Rewards for Robust Policy Optimization
by: Patel, Nirmal, et al.
Published: (2026)

Interleaved Head Attention
by: Duvvuri, Sai Surya, et al.
Published: (2026)

Towards Quantifying the Preconditioning Effect of Adam
by: Das, Rudrajit, et al.
Published: (2024)

The Art of Scaling Reinforcement Learning Compute for LLMs
by: Khatri, Devvrit, et al.
Published: (2025)

LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
by: Yen, Jui-Nan, et al.
Published: (2024)

Dual-Encoders for Extreme Multi-Label Classification
by: Gupta, Nilesh, et al.
Published: (2023)

LLM-guided Hierarchical Search for End-to-end Reasoning Intensive Retrieval
by: Gupta, Nilesh, et al.
Published: (2025)

EHI: End-to-end Learning of Hierarchical Index for Efficient Dense Retrieval
by: Kumar, Ramnath, et al.
Published: (2023)

Scalable In-context Ranking with Generative Models
by: Gupta, Nilesh, et al.
Published: (2025)

Geometric Median (GM) Matching for Robust Data Pruning
by: Acharya, Anish, et al.
Published: (2024)

Fast and Simplex: 2-Simplicial Attention in Triton
by: Roy, Aurko, et al.
Published: (2025)

Geometric Median Matching for Robust k-Subset Selection from Noisy Data
by: Acharya, Anish, et al.
Published: (2025)

Understanding Contrastive Representation Learning from Positive Unlabeled (PU) Data
by: Acharya, Anish, et al.
Published: (2024)

Compressing Many-Shots in In-Context Learning
by: Khatri, Devvrit, et al.
Published: (2025)

LUCID: Learning-Enabled Uncertainty-Aware Certification of Stochastic Dynamical Systems
by: Casablanca, Ernesto, et al.
Published: (2025)

Preconditioned Attention: Enhancing Efficiency in Transformers
by: Saratchandran, Hemanth
Published: (2026)

Positive Unlabeled Contrastive Learning
by: Acharya, Anish, et al.
Published: (2022)

Retraining with Predicted Hard Labels Provably Increases Model Accuracy
by: Das, Rudrajit, et al.
Published: (2024)

Two-stage LLM Fine-tuning with Less Specialization and More Generalization
by: Wang, Yihan, et al.
Published: (2022)

Attention Meets UAVs: A Comprehensive Evaluation of DDoS Detection in Low-Cost UAVs
by: Sharma, Ashish, et al.
Published: (2024)

Large Language Models are Interpretable Learners
by: Wang, Ruochen, et al.
Published: (2024)

Multi-Head Attention Is a Multi-Player Game
by: Chakrabarti, Kushal, et al.
Published: (2026)

Let's (not) just put things in Context: Test-Time Training for Long-Context LLMs
by: Bansal, Rachit, et al.
Published: (2025)

Training Dynamics of Softmax Self-Attention: Fast Global Convergence via Preconditioning
by: Goel, Gautam, et al.
Published: (2026)

Exploring Design Choices for Building Language-Specific LLMs
by: Tejaswi, Atula, et al.
Published: (2024)

OSDN: Improving Delta Rule with Provable Online Preconditioning in Linear Attention
by: Zhou, Chenyu, et al.
Published: (2026)

Universal Sequence Preconditioning
by: Marsden, Annie, et al.
Published: (2025)

Matryoshka Model Learning for Improved Elastic Student Models
by: Verma, Chetan, et al.
Published: (2025)

Paged Attention Meets FlexAttention: Unlocking Long-Context Efficiency in Deployed Inference
by: Joshi, Thomas, et al.
Published: (2025)

Multi-Knowledge Fusion Network for Time Series Representation Learning
by: Sakhinana, Sagar Srinivas, et al.
Published: (2024)

Are Anxiety Detection Models Generalizable? A Cross-Activity and Cross-Population Study Using Wearables
by: Sahu, Nilesh Kumar, et al.
Published: (2025)

Open-TQ-Metal: Fused Compressed-Domain Attention for Long-Context LLM Inference on Apple Silicon
by: Vegasena, Sai
Published: (2026)

Multi-Source Knowledge-Based Hybrid Neural Framework for Time Series Representation Learning
by: Sakhinana, Sagar Srinivas, et al.
Published: (2024)

On the Nystrom Approximation for Preconditioning in Kernel Machines
by: Abedsoltan, Amirhesam, et al.
Published: (2023)

A Representation-Consistent Gated Recurrent Framework for Robust Medical Time-Series Classification
by: Sai, Maitri Krishna
Published: (2026)

AnxietyFaceTrack: A Smartphone-Based Non-Intrusive Approach for Detecting Social Anxiety Using Facial Features
by: Sahu, Nilesh Kumar, et al.
Published: (2025)

MatFormer: Nested Transformer for Elastic Inference
by: Devvrit, et al.
Published: (2023)

The Power of Second Order Methods for Sequence Preconditioning
by: Marsden, Annie, et al.
Published: (2026)

Preconditioned Inexact Stochastic ADMM for Deep Model
by: Zhou, Shenglong, et al.
Published: (2025)