| Main Authors: | Zagitov, Artur, Molodtsov, Gleb, Beznosikov, Aleksandr |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.29843 |
Similar Items
Hierarchical Mixture-of-Experts with Two-Stage Optimization
by: Molodtsov, Gleb, et al.
Published: (2026)
Extreme Low-Bit Inference in Reasoning Models: Failure Modes and Targeted Recovery
by: Alimaskina, Ekaterina, et al.
Published: (2026)
Optimal Data Splitting in Distributed Optimization for Machine Learning
by: Medyakov, Daniil, et al.
Published: (2024)
Shuffling Heuristic in Variational Inequalities: Establishing New Convergence Guarantees
by: Medyakov, Daniil, et al.
Published: (2025)
Variance Reduction Methods Do Not Need to Compute Full Gradients: Improved Efficiency through Shuffling
by: Medyakov, Daniil, et al.
Published: (2025)
Communication-Efficient Federated Learning with Adaptive Number of Participants
by: Skorik, Sergey, et al.
Published: (2025)
HARP: Measuring Harm Amplification in Multi-Agent LLM Systems
by: Rahman, Md Hafizur, et al.
Published: (2026)
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
by: Tseng, Albert, et al.
Published: (2024)
Sign-SGD via Parameter-Free Optimization
by: Medyakov, Daniil, et al.
Published: (2025)
Thinking like a CHEMIST: Combined Heterogeneous Embedding Model Integrating Structure and Tokens
by: Rekut, Nikolai, et al.
Published: (2025)
Effective Method with Compression for Distributed and Federated Cocoercive Variational Inequalities
by: Medyakov, Daniil, et al.
Published: (2024)
HARP: Hesitation-Aware Reframing in Transformer Inference Pass
by: Storaï, Romain, et al.
Published: (2024)
Efficient and Adaptive Human Activity Recognition via LLM Backbones
by: Bredikhin, Aleksandr, et al.
Published: (2026)
Influence-Inspired Spectral Rotations for Extreme Low-Bit LLM Quantization
by: Pavlov, Gorgi
Published: (2026)
Compute-Optimal Quantization-Aware Training
by: Dremov, Aleksandr, et al.
Published: (2025)
Preconditioned Norms: A Unified Framework for Steepest Descent, Quasi-Newton and Adaptive Methods
by: Veprikov, Andrey, et al.
Published: (2025)
Muon$^2$: Boosting Muon via Adaptive Second-Moment Preconditioning
by: Liu, Ziyue, et al.
Published: (2026)
Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
by: Zhang, Tianao, et al.
Published: (2025)
RotateKV: Accurate and Robust 2-Bit KV Cache Quantization for LLMs via Outlier-Aware Adaptive Rotations
by: Su, Zunhai, et al.
Published: (2025)
Pushing the Limits of Block Rotations in Post-Training Quantization
by: Sanjeet, Sai, et al.
Published: (2026)
RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference
by: Gautam, Arpit Singh, et al.
Published: (2026)
TORQ: Two-Level Orthogonal Rotation for MXFP4 Quantization
by: Xu, Zukang, et al.
Published: (2026)
Grouped Sequency-arranged Rotation: Optimizing Rotation Transformation for Quantization for Free
by: Choi, Euntae, et al.
Published: (2025)
CALM: A CKA-Guided Adaptive Layer-Wise Modularization Framework for LLM Quantization
by: Zhang, Jinhao, et al.
Published: (2025)
Bant: Byzantine Antidote via Trial Function and Trust Scores
by: Molodtsov, Gleb, et al.
Published: (2025)
Quantizing With Randomized Hadamard Transforms: Efficient Heuristic Now Proven
by: Ben-Basat, Ran, et al.
Published: (2026)
Rotate, Clip, and Partition: Towards W2A4KV4 Quantization by Integrating Rotation and Learnable Non-uniform Quantizer
by: Choi, Euntae, et al.
Published: (2025)
Stochastic Gradient Methods with Preconditioned Updates
by: Sadiev, Abdurakhmon, et al.
Published: (2022)
AIS: Adaptive Importance Sampling for Quantized RL
by: Zhou, Jiajun, et al.
Published: (2026)
HARP: A Large-Scale Higher-Order Ambisonic Room Impulse Response Dataset
by: Saini, Shivam, et al.
Published: (2024)
HARP: Human-Assisted Regrouping with Permutation Invariant Critic for Multi-Agent Reinforcement Learning
by: Hu, Huawen, et al.
Published: (2024)
Exploring LLM Agents for Cleaning Tabular Machine Learning Datasets
by: Bendinelli, Tommaso, et al.
Published: (2025)
Exploiting LLM Quantization
by: Egashira, Kazuki, et al.
Published: (2024)
SDQ: Sparse Decomposed Quantization for LLM Inference
by: Jeong, Geonhwa, et al.
Published: (2024)
MoBiQuant: Mixture-of-Bits Quantization for Token-Adaptive Any-Precision LLM
by: Wang, Dongwei, et al.
Published: (2026)
Neural Network Pruning via QUBO Optimization
by: Orabi, Osama, et al.
Published: (2026)
ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization
by: Liu, Zechun, et al.
Published: (2025)
HESTIA: A Hessian-Guided Differentiable Quantization-Aware Training Framework for Extremely Low-Bit LLMs
by: Wang, Guoan, et al.
Published: (2026)
MambaQuant: Quantizing the Mamba Family with Variance Aligned Rotation Methods
by: Xu, Zukang, et al.
Published: (2025)
LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization
by: Zhao, Juntao, et al.
Published: (2024)