Library Catalog

Bibliographic Details
Main Authors:	Alimaskina, Ekaterina; Rudas, Darya; Shveykin, Denis; Molodtsov, Gleb; Vasiliev, Pavel; Beznosikov, Aleksandr
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2606.02011

Similar Items

HARP: Hadamard-Preconditioned Adaptive Rotation Processor for Extreme LLM Quantization
by: Zagitov, Artur, et al.
Published: (2026)

Hierarchical Mixture-of-Experts with Two-Stage Optimization
by: Molodtsov, Gleb, et al.
Published: (2026)

Optimal Data Splitting in Distributed Optimization for Machine Learning
by: Medyakov, Daniil, et al.
Published: (2024)

Effective Method with Compression for Distributed and Federated Cocoercive Variational Inequalities
by: Medyakov, Daniil, et al.
Published: (2024)

Shuffling Heuristic in Variational Inequalities: Establishing New Convergence Guarantees
by: Medyakov, Daniil, et al.
Published: (2025)

Variance Reduction Methods Do Not Need to Compute Full Gradients: Improved Efficiency through Shuffling
by: Medyakov, Daniil, et al.
Published: (2025)

Thinking like a CHEMIST: Combined Heterogeneous Embedding Model Integrating Structure and Tokens
by: Rekut, Nikolai, et al.
Published: (2025)

Metropolis-Scale Road Network Datasets for Fine-Grained Urban Traffic Modeling
by: Velikonivtsev, Fedor, et al.
Published: (2025)

LCD: Advancing Extreme Low-Bit Clustering for Large Language Models via Knowledge Distillation
by: Liu, Fangxin, et al.
Published: (2025)

Communication-Efficient Federated Learning with Adaptive Number of Participants
by: Skorik, Sergey, et al.
Published: (2025)

Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
by: Zhang, Tianao, et al.
Published: (2025)

Sign-SGD via Parameter-Free Optimization
by: Medyakov, Daniil, et al.
Published: (2025)

Quantization Meets Reasoning: Exploring and Mitigating Degradation of Low-Bit LLMs in Mathematical Reasoning
by: Li, Zhen, et al.
Published: (2025)

Diagonal-Tiled Mixed-Precision Attention for Efficient Low-Bit MXFP Inference
by: Ding, Yifu, et al.
Published: (2026)

HESTIA: A Hessian-Guided Differentiable Quantization-Aware Training Framework for Extremely Low-Bit LLMs
by: Wang, Guoan, et al.
Published: (2026)

LUT-DLA: Lookup Table as Efficient Extreme Low-Bit Deep Learning Accelerator
by: Li, Guoyu, et al.
Published: (2025)

PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
by: Zhao, Jiaqi, et al.
Published: (2025)

I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
by: Hu, Xing, et al.
Published: (2024)

Degradation Modeling and Prognostic Analysis Under Unknown Failure Modes
by: Fu, Ying, et al.
Published: (2024)

LittleBit: Ultra Low-Bit Quantization via Latent Factorization
by: Lee, Banseok, et al.
Published: (2025)

Revealing Interpretable Failure Modes of VLMs
by: Chaudhary, Isha, et al.
Published: (2026)

Unleashing Low-Bit Inference on Ascend NPUs: A Comprehensive Evaluation of HiFloat Formats
by: Zhao, Pengxiang, et al.
Published: (2026)

Influence-Inspired Spectral Rotations for Extreme Low-Bit LLM Quantization
by: Pavlov, Gorgi
Published: (2026)

Dynamic Low-rank Approximation of Full-Matrix Preconditioner for Training Generalized Linear Models
by: Matveeva, Tatyana, et al.
Published: (2025)

Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models
by: Sakarvadia, Mansi, et al.
Published: (2023)

Large Language Model Reasoning Failures
by: Song, Peiyang, et al.
Published: (2026)

More Than Bits: Multi-Envelope Double Binary Factorization for Extreme Quantization
by: Ichikawa, Yuma, et al.
Published: (2025)

Bant: Byzantine Antidote via Trial Function and Trust Scores
by: Molodtsov, Gleb, et al.
Published: (2025)

What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
by: Lv, Keyu, et al.
Published: (2026)

Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification
by: Fadeeva, Ekaterina, et al.
Published: (2024)

Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression
by: Zhang, Xi, et al.
Published: (2025)

InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
by: Li, Ke, et al.
Published: (2026)

Why Do Some Inputs Break Low-Bit LLM Quantization?
by: Chang, Ting-Yun, et al.
Published: (2025)

Pushing the Limits of Low-Bit Optimizers: A Focus on EMA Dynamics
by: Xu, Cong, et al.
Published: (2025)

SplitQuant: Layer Splitting for Low-Bit Neural Network Quantization
by: Song, Jaewoo, et al.
Published: (2025)

SHARP: Accelerating Language Model Inference by SHaring Adjacent layers with Recovery Parameters
by: Wang, Yiping, et al.
Published: (2025)

Bayesian Inverse Problems Meet Flow Matching: Efficient and Flexible Inference via Transformers
by: Sherki, Daniil, et al.
Published: (2025)

Solving Dual Sourcing Problems with Supply Mode Dependent Failure Rates
by: Akkerman, Fabian, et al.
Published: (2024)

MemFail: Stress-Testing Failure Modes of LLM Memory Systems
by: Garg, Ishir, et al.
Published: (2026)

Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking
by: Feuer, Benjamin, et al.
Published: (2024)