| Main Authors: | Alimaskina, Ekaterina, Rudas, Darya, Shveykin, Denis, Molodtsov, Gleb, Vasiliev, Pavel, Beznosikov, Aleksandr |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2606.02011 |
Similar Items
HARP: Hadamard-Preconditioned Adaptive Rotation Processor for Extreme LLM Quantization
by: Zagitov, Artur, et al.
Published: (2026)
Hierarchical Mixture-of-Experts with Two-Stage Optimization
by: Molodtsov, Gleb, et al.
Published: (2026)
Optimal Data Splitting in Distributed Optimization for Machine Learning
by: Medyakov, Daniil, et al.
Published: (2024)
Effective Method with Compression for Distributed and Federated Cocoercive Variational Inequalities
by: Medyakov, Daniil, et al.
Published: (2024)
Shuffling Heuristic in Variational Inequalities: Establishing New Convergence Guarantees
by: Medyakov, Daniil, et al.
Published: (2025)
Variance Reduction Methods Do Not Need to Compute Full Gradients: Improved Efficiency through Shuffling
by: Medyakov, Daniil, et al.
Published: (2025)
Thinking like a CHEMIST: Combined Heterogeneous Embedding Model Integrating Structure and Tokens
by: Rekut, Nikolai, et al.
Published: (2025)
Metropolis-Scale Road Network Datasets for Fine-Grained Urban Traffic Modeling
by: Velikonivtsev, Fedor, et al.
Published: (2025)
LCD: Advancing Extreme Low-Bit Clustering for Large Language Models via Knowledge Distillation
by: Liu, Fangxin, et al.
Published: (2025)
Communication-Efficient Federated Learning with Adaptive Number of Participants
by: Skorik, Sergey, et al.
Published: (2025)
Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
by: Zhang, Tianao, et al.
Published: (2025)
Sign-SGD via Parameter-Free Optimization
by: Medyakov, Daniil, et al.
Published: (2025)
Quantization Meets Reasoning: Exploring and Mitigating Degradation of Low-Bit LLMs in Mathematical Reasoning
by: Li, Zhen, et al.
Published: (2025)
Diagonal-Tiled Mixed-Precision Attention for Efficient Low-Bit MXFP Inference
by: Ding, Yifu, et al.
Published: (2026)
HESTIA: A Hessian-Guided Differentiable Quantization-Aware Training Framework for Extremely Low-Bit LLMs
by: Wang, Guoan, et al.
Published: (2026)
LUT-DLA: Lookup Table as Efficient Extreme Low-Bit Deep Learning Accelerator
by: Li, Guoyu, et al.
Published: (2025)
PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
by: Zhao, Jiaqi, et al.
Published: (2025)
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
by: Hu, Xing, et al.
Published: (2024)
Degradation Modeling and Prognostic Analysis Under Unknown Failure Modes
by: Fu, Ying, et al.
Published: (2024)
LittleBit: Ultra Low-Bit Quantization via Latent Factorization
by: Lee, Banseok, et al.
Published: (2025)
Revealing Interpretable Failure Modes of VLMs
by: Chaudhary, Isha, et al.
Published: (2026)
Unleashing Low-Bit Inference on Ascend NPUs: A Comprehensive Evaluation of HiFloat Formats
by: Zhao, Pengxiang, et al.
Published: (2026)
Influence-Inspired Spectral Rotations for Extreme Low-Bit LLM Quantization
by: Pavlov, Gorgi
Published: (2026)
Dynamic Low-rank Approximation of Full-Matrix Preconditioner for Training Generalized Linear Models
by: Matveeva, Tatyana, et al.
Published: (2025)
Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models
by: Sakarvadia, Mansi, et al.
Published: (2023)
Large Language Model Reasoning Failures
by: Song, Peiyang, et al.
Published: (2026)
More Than Bits: Multi-Envelope Double Binary Factorization for Extreme Quantization
by: Ichikawa, Yuma, et al.
Published: (2025)
Bant: Byzantine Antidote via Trial Function and Trust Scores
by: Molodtsov, Gleb, et al.
Published: (2025)
What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
by: Lv, Keyu, et al.
Published: (2026)
Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification
by: Fadeeva, Ekaterina, et al.
Published: (2024)
Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression
by: Zhang, Xi, et al.
Published: (2025)
InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
by: Li, Ke, et al.
Published: (2026)
Why Do Some Inputs Break Low-Bit LLM Quantization?
by: Chang, Ting-Yun, et al.
Published: (2025)
Pushing the Limits of Low-Bit Optimizers: A Focus on EMA Dynamics
by: Xu, Cong, et al.
Published: (2025)
SplitQuant: Layer Splitting for Low-Bit Neural Network Quantization
by: Song, Jaewoo, et al.
Published: (2025)
SHARP: Accelerating Language Model Inference by SHaring Adjacent layers with Recovery Parameters
by: Wang, Yiping, et al.
Published: (2025)
Bayesian Inverse Problems Meet Flow Matching: Efficient and Flexible Inference via Transformers
by: Sherki, Daniil, et al.
Published: (2025)
Solving Dual Sourcing Problems with Supply Mode Dependent Failure Rates
by: Akkerman, Fabian, et al.
Published: (2024)
MemFail: Stress-Testing Failure Modes of LLM Memory Systems
by: Garg, Ishir, et al.
Published: (2026)
Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking
by: Feuer, Benjamin, et al.
Published: (2024)