| Main Authors: | Zhou, Yuli, Chen, Qingxuan, Benini, Luca, Sun, Guolei, Li, Yawei |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.02151 |
Similar Items
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
by: Huang, Wei, et al.
Published: (2024)
Optimal Brain Restoration for Joint Quantization and Sparsification of LLMs
by: Guo, Hang, et al.
Published: (2025)
CamSAM2: Segment Anything Accurately in Camouflaged Videos
by: Zhou, Yuli, et al.
Published: (2025)
Direct Quantized Training of Language Models with Stochastic Rounding
by: Zhao, Kaiyan, et al.
Published: (2024)
Reparameterized LLM Training via Orthogonal Equivalence Transformation
by: Qiu, Zeju, et al.
Published: (2025)
When SAM2 Meets Video Camouflaged Object Segmentation: A Comprehensive Evaluation and Adaptation
by: Zhou, Yuli, et al.
Published: (2024)
Robust Training of Vector Quantized Bottleneck Models
by: Łańcucki, Adrian, et al.
Published: (2020)
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
by: You, Haoran, et al.
Published: (2024)
RDKV: Rate-Distortion Bit Allocation for Joint Eviction and Quantization of the KV Cache
by: Zhang, Junkai, et al.
Published: (2026)
A Reparameterized Discrete Diffusion Model for Text Generation
by: Zheng, Lin, et al.
Published: (2023)
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
by: Cheng, Wenhua, et al.
Published: (2023)
End-to-End Training for Back-Translation with Categorical Reparameterization Trick
by: Heo, DongNyeong, et al.
Published: (2022)
AAAC: Activation-Aware Adaptive Codebooks for 4-bit LLM Weight Quantization
by: IslamBouli, Beshr, et al.
Published: (2026)
FlatQuant: Flatness Matters for LLM Quantization
by: Sun, Yuxuan, et al.
Published: (2024)
CRVQ: Channel-Relaxed Vector Quantization for Extreme Compression of LLMs
by: Xu, Yuzhuang, et al.
Published: (2024)
SqueezeLLM: Dense-and-Sparse Quantization
by: Kim, Sehoon, et al.
Published: (2023)
Robust and Efficient Fine-tuning of LLMs with Bayesian Reparameterization of Low-Rank Adaptation
by: Sengupta, Ayan, et al.
Published: (2024)
Revisiting Entropy Regularization: Adaptive Coefficient Unlocks Its Potential for LLM Reinforcement Learning
by: Zhang, Xiaoyun, et al.
Published: (2025)
RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models
by: Xu, Zukang, et al.
Published: (2025)
HiM2SAM: Enhancing SAM2 with Hierarchical Motion Estimation and Memory Optimization towards Long-term Tracking
by: Chen, Ruixiang, et al.
Published: (2025)
MoBiQuant: Mixture-of-Bits Quantization for Token-Adaptive Any-Precision LLM
by: Wang, Dongwei, et al.
Published: (2026)
Technical Report: Activation Residual Hessian Quantization (ARHQ) for Low-Bit LLM Quantization
by: Wang, YiFeng, et al.
Published: (2026)
GPTVQ: The Blessing of Dimensionality for LLM Quantization
by: van Baalen, Mart, et al.
Published: (2024)
Round and Round We Go! What makes Rotary Positional Encodings useful?
by: Barbero, Federico, et al.
Published: (2024)
DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization
by: Shao, Yuantian, et al.
Published: (2025)
Vector Quantized Latent Concepts: A Scalable Alternative to Clustering-Based Concept Discovery
by: Yu, Xuemin, et al.
Published: (2026)
Adaptive Task Vectors for Large Language Models
by: Kang, Joonseong, et al.
Published: (2025)
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
by: Zhang, Yihua, et al.
Published: (2024)
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
by: Liu, Di, et al.
Published: (2024)
QUAD: Quantization and Parameter-Efficient Tuning of LLM with Activation Decomposition
by: Hu, Yuxuan, et al.
Published: (2025)
Transformer-VQ: Linear-Time Transformers via Vector Quantization
by: Lingle, Lucas D.
Published: (2023)
Evaluation Hallucination in Multi-Round Incomplete Information Lateral-Driven Reasoning Tasks
by: Dong, Wenhan, et al.
Published: (2025)
Scaling Law for Quantization-Aware Training
by: Chen, Mengzhao, et al.
Published: (2025)
StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
by: Wang, Shida, et al.
Published: (2023)
Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention
by: Gao, Bin, et al.
Published: (2024)
A Framework for Cost-Effective and Self-Adaptive LLM Shaking and Recovery Mechanism
by: Chen, Zhiyu, et al.
Published: (2024)
Back to Basics: Revisiting Exploration in Reinforcement Learning for LLM Reasoning via Generative Probabilities
by: Li, Pengyi, et al.
Published: (2026)
BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search
by: Sun, Linzhuang, et al.
Published: (2024)
From Signal Degradation to Computation Collapse: Uncovering the Two Failure Modes of LLM Quantization
by: Zhou, Chenxi, et al.
Published: (2026)
Trust in One Round: Confidence Estimation for Large Language Models via Structural Signals
by: Yang, Pengyue, et al.
Published: (2026)