:: Library Catalog

Bibliographic Details
Main Authors:	Zagitov, Artur; Molodtsov, Gleb; Beznosikov, Aleksandr
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.29843

Similar Items

Hierarchical Mixture-of-Experts with Two-Stage Optimization
by: Molodtsov, Gleb, et al.
Published: (2026)

Extreme Low-Bit Inference in Reasoning Models: Failure Modes and Targeted Recovery
by: Alimaskina, Ekaterina, et al.
Published: (2026)

Optimal Data Splitting in Distributed Optimization for Machine Learning
by: Medyakov, Daniil, et al.
Published: (2024)

Shuffling Heuristic in Variational Inequalities: Establishing New Convergence Guarantees
by: Medyakov, Daniil, et al.
Published: (2025)

Variance Reduction Methods Do Not Need to Compute Full Gradients: Improved Efficiency through Shuffling
by: Medyakov, Daniil, et al.
Published: (2025)

Communication-Efficient Federated Learning with Adaptive Number of Participants
by: Skorik, Sergey, et al.
Published: (2025)

HARP: Measuring Harm Amplification in Multi-Agent LLM Systems
by: Rahman, Md Hafizur, et al.
Published: (2026)

QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
by: Tseng, Albert, et al.
Published: (2024)

Sign-SGD via Parameter-Free Optimization
by: Medyakov, Daniil, et al.
Published: (2025)

Thinking like a CHEMIST: Combined Heterogeneous Embedding Model Integrating Structure and Tokens
by: Rekut, Nikolai, et al.
Published: (2025)

Effective Method with Compression for Distributed and Federated Cocoercive Variational Inequalities
by: Medyakov, Daniil, et al.
Published: (2024)

HARP: Hesitation-Aware Reframing in Transformer Inference Pass
by: Storaï, Romain, et al.
Published: (2024)

Efficient and Adaptive Human Activity Recognition via LLM Backbones
by: Bredikhin, Aleksandr, et al.
Published: (2026)

Influence-Inspired Spectral Rotations for Extreme Low-Bit LLM Quantization
by: Pavlov, Gorgi
Published: (2026)

Compute-Optimal Quantization-Aware Training
by: Dremov, Aleksandr, et al.
Published: (2025)

Preconditioned Norms: A Unified Framework for Steepest Descent, Quasi-Newton and Adaptive Methods
by: Veprikov, Andrey, et al.
Published: (2025)

Muon$^2$: Boosting Muon via Adaptive Second-Moment Preconditioning
by: Liu, Ziyue, et al.
Published: (2026)

Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
by: Zhang, Tianao, et al.
Published: (2025)

RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations
by: Su, Zunhai, et al.
Published: (2025)

Pushing the Limits of Block Rotations in Post-Training Quantization
by: Sanjeet, Sai, et al.
Published: (2026)

RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference
by: Gautam, Arpit Singh, et al.
Published: (2026)

TORQ: Two-Level Orthogonal Rotation for MXFP4 Quantization
by: Xu, Zukang, et al.
Published: (2026)

Grouped Sequency-arranged Rotation: Optimizing Rotation Transformation for Quantization for Free
by: Choi, Euntae, et al.
Published: (2025)

CALM: A CKA-Guided Adaptive Layer-Wise Modularization Framework for LLM Quantization
by: Zhang, Jinhao, et al.
Published: (2025)

Bant: Byzantine Antidote via Trial Function and Trust Scores
by: Molodtsov, Gleb, et al.
Published: (2025)

Quantizing With Randomized Hadamard Transforms: Efficient Heuristic Now Proven
by: Ben-Basat, Ran, et al.
Published: (2026)

Rotate, Clip, and Partition: Towards W2A4KV4 Quantization by Integrating Rotation and Learnable Non-uniform Quantizer
by: Choi, Euntae, et al.
Published: (2025)

Stochastic Gradient Methods with Preconditioned Updates
by: Sadiev, Abdurakhmon, et al.
Published: (2022)

AIS: Adaptive Importance Sampling for Quantized RL
by: Zhou, Jiajun, et al.
Published: (2026)

HARP: A Large-Scale Higher-Order Ambisonic Room Impulse Response Dataset
by: Saini, Shivam, et al.
Published: (2024)

HARP: Human-Assisted Regrouping with Permutation Invariant Critic for Multi-Agent Reinforcement Learning
by: Hu, Huawen, et al.
Published: (2024)

Exploring LLM Agents for Cleaning Tabular Machine Learning Datasets
by: Bendinelli, Tommaso, et al.
Published: (2025)

Exploiting LLM Quantization
by: Egashira, Kazuki, et al.
Published: (2024)

SDQ: Sparse Decomposed Quantization for LLM Inference
by: Jeong, Geonhwa, et al.
Published: (2024)

MoBiQuant: Mixture-of-Bits Quantization for Token-Adaptive Any-Precision LLM
by: Wang, Dongwei, et al.
Published: (2026)

Neural Network Pruning via QUBO Optimization
by: Orabi, Osama, et al.
Published: (2026)

ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization
by: Liu, Zechun, et al.
Published: (2025)

HESTIA: A Hessian-Guided Differentiable Quantization-Aware Training Framework for Extremely Low-Bit LLMs
by: Wang, Guoan, et al.
Published: (2026)

MambaQuant: Quantizing the Mamba Family with Variance Aligned Rotation Methods
by: Xu, Zukang, et al.
Published: (2025)

LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization
by: Zhao, Juntao, et al.
Published: (2024)