| Main Authors: | Roy, Aurko, Chou, Timothy, Duvvuri, Sai Surya, Chen, Sijia, Yu, Jiecao, Wang, Xiaodong, Zaheer, Manzil, Anil, Rohan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.02754 |
Similar Items
The Art of Scaling Reinforcement Learning Compute for LLMs
by: Khatri, Devvrit, et al.
Published: (2025)
Differentially Private Model Merging
by: Yin, Qichuan, et al.
Published: (2026)
A Statistical Framework for Data-dependent Retrieval-Augmented Models
by: Basu, Soumya, et al.
Published: (2024)
Federation over Text: Insight Sharing for Multi-Agent Reasoning
by: Yao, Dixi, et al.
Published: (2026)
Interleaved Head Attention
by: Duvvuri, Sai Surya, et al.
Published: (2026)
Cite Pretrain: Retrieval-Free Knowledge Attribution for Large Language Models
by: Huang, Yukun, et al.
Published: (2025)
LASER: Attention with Exponential Transformation
by: Duvvuri, Sai Surya, et al.
Published: (2024)
Deep Reinforcement Learning for Sequential Combinatorial Auctions
by: Ravindranath, Sai Srivatsa, et al.
Published: (2024)
LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization
by: Yen, Jui-Nan, et al.
Published: (2024)
Accelerating Transformer Inference and Training with 2:4 Activation Sparsity
by: Haziza, Daniel, et al.
Published: (2025)
Generalized Simplicial Attention Neural Networks
by: Battiloro, Claudio, et al.
Published: (2023)
Rethinking Thinking Tokens: LLMs as Improvement Operators
by: Madaan, Lovish, et al.
Published: (2025)
LUCID: Attention with Preconditioned Representations
by: Duvvuri, Sai Surya, et al.
Published: (2026)
Green MLOps: Closed-Loop, Energy-Aware Inference with NVIDIA Triton, FastAPI, and Bio-Inspired Thresholding
by: Hamdi, Mustapha, et al.
Published: (2026)
The Anatomy of a Triton Attention Kernel
by: Ringlein, Burkhard, et al.
Published: (2025)
Geak: Introducing Triton Kernel AI Agent & Evaluation Benchmarks
by: Wang, Jianghui, et al.
Published: (2025)
A Barrier Certificate-based Simplex Architecture for Systems with Approximate and Hybrid Dynamics
by: Damare, Amol, et al.
Published: (2022)
LLMs Judging LLMs: A Simplex Perspective
by: Vossler, Patrick, et al.
Published: (2025)
Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations
by: Liu, Wei, et al.
Published: (2026)
Simplicial SMOTE: Oversampling Solution to the Imbalanced Learning Problem
by: Kachan, Oleg, et al.
Published: (2025)
Liger Kernel: Efficient Triton Kernels for LLM Training
by: Hsu, Pin-Lun, et al.
Published: (2024)
Simplex-enabled Safe Continual Learning Machine
by: Cao, Hongpeng, et al.
Published: (2024)
Attention Smoothing Is All You Need For Unlearning
by: Zade, Saleh Zare, et al.
Published: (2026)
A Simplex Witness Certificate for Constant Collapse in Variational Autoencoders
by: Zhang, Zegu, et al.
Published: (2026)
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?
by: Yan, Kai, et al.
Published: (2025)
Scaling LLM Multi-turn RL with End-to-end Summarization-based Context Management
by: Lu, Miao, et al.
Published: (2025)
Trainable and Explainable Simplicial Map Neural Networks
by: Paluzo-Hidalgo, Eduardo, et al.
Published: (2023)
Graph and Simplicial Complex Prediction Gaussian Process via the Hodgelet Representations
by: Alain, Mathieu, et al.
Published: (2025)
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex
by: Qu, Yun, et al.
Published: (2026)
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision
by: Shah, Jay, et al.
Published: (2024)
Simplicial Embeddings Improve Sample Efficiency in Actor-Critic Agents
by: Obando-Ceron, Johan, et al.
Published: (2025)
Toward Adaptive Reasoning in Large Language Models with Thought Rollback
by: Chen, Sijia, et al.
Published: (2024)
Efficient Distributed Optimization under Heavy-Tailed Noise
by: Lee, Su Hyeong, et al.
Published: (2025)
UMBCLU at SemEval-2024 Task 1A and 1C: Semantic Textual Relatedness with and without machine translation
by: Dipta, Shubhashis Roy, et al.
Published: (2024)
A SUPERB-Style Benchmark of Self-Supervised Speech Models for Audio Deepfake Detection
by: Ali, Hashim, et al.
Published: (2026)
Fast PINN Eigensolvers via Biconvex Reformulation
by: Banderwaar, Akshay Sai, et al.
Published: (2025)
Topological Deep Learning with State-Space Models: A Mamba Approach for Simplicial Complexes
by: Montagna, Marco, et al.
Published: (2024)
SALE : Low-bit Estimation for Efficient Sparse Attention in Long-context LLM Prefilling
by: Ji, Xiaodong, et al.
Published: (2025)
MIR-Bench: Can Your LLM Recognize Complicated Patterns via Many-Shot In-Context Reasoning?
by: Yan, Kai, et al.
Published: (2025)
Scaling Multi-Node Mixture-of-Experts Inference Using Expert Activation Patterns
by: Bambhaniya, Abhimanyu, et al.
Published: (2026)