| Main Authors: | Liu, Yutong, Zhao, Cairong, Hu, Guosheng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.17417 |
Similar Items
A Comprehensive Study on Quantization Techniques for Large Language Models
by: Lang, Jiedong, et al.
Published: (2024)
Large Language Models for Code Generation: A Comprehensive Survey of Challenges, Techniques, Evaluation, and Applications
by: Huynh, Nam, et al.
Published: (2025)
Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities
by: Zhou, Hao, et al.
Published: (2024)
Advancing Graph Representation Learning with Large Language Models: A Comprehensive Survey of Techniques
by: Mao, Qiheng, et al.
Published: (2024)
CPTQuant - A Novel Mixed Precision Post-Training Quantization Techniques for Large Language Models
by: Nanda, Amitash, et al.
Published: (2024)
Mixed-Precision Quantization for Language Models: Techniques and Prospects
by: Rakka, Mariam, et al.
Published: (2025)
CBQ: Cross-Block Quantization for Large Language Models
by: Ding, Xin, et al.
Published: (2023)
Optimizing Large Language Models through Quantization: A Comparative Analysis of PTQ and QAT Techniques
by: Hasan, Jahid
Published: (2024)
Subspace Optimization for Large Language Models with Convergence Guarantees
by: He, Yutong, et al.
Published: (2024)
Art and Science of Quantizing Large-Scale Models: A Comprehensive Overview
by: Wang, Yanshu, et al.
Published: (2024)
Saliency-Aware Regularized Quantization Calibration for Large Language Models
by: Zhao, Yanlong, et al.
Published: (2026)
Statistically-Lossless Quantization of Large Language Models
by: Helcig, Michael, et al.
Published: (2026)
Optimizing Large Language Model Training Using FP4 Quantization
by: Wang, Ruizhe, et al.
Published: (2025)
RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models
by: Xu, Zukang, et al.
Published: (2025)
QQQ: Quality Quattuor-Bit Quantization for Large Language Models
by: Zhang, Ying, et al.
Published: (2024)
UniComp: A Unified Evaluation of Large Language Model Compression via Pruning, Quantization and Distillation
by: von Rad, Jonathan, et al.
Published: (2026)
Q-resafe: Assessing Safety Risks and Quantization-aware Safety Patching for Quantized Large Language Models
by: Chen, Kejia, et al.
Published: (2025)
FBQuant: FeedBack Quantization for Large Language Models
by: Liu, Yijiang, et al.
Published: (2025)
Benchmarking Post-Training Quantization in LLMs: Comprehensive Taxonomy, Unified Evaluation, and Comparative Analysis
by: Zhao, Jiaqi, et al.
Published: (2025)
GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference
by: Zeng, Chao, et al.
Published: (2024)
MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models
by: Liu, Wenyuan, et al.
Published: (2025)
Democratizing Large Language Model-Based Graph Data Augmentation via Latent Knowledge Graphs
by: Feng, Yushi, et al.
Published: (2025)
ChemEval: A Comprehensive Multi-Level Chemical Evaluation for Large Language Models
by: Huang, Yuqing, et al.
Published: (2024)
Tequila: Trapping-free Ternary Quantization for Large Language Models
by: Huang, Hong, et al.
Published: (2025)
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
by: Shao, Wenqi, et al.
Published: (2023)
QuZO: Quantized Zeroth-Order Fine-Tuning for Large Language Models
by: Zhou, Jiajun, et al.
Published: (2025)
ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models
by: Zeng, Chao, et al.
Published: (2024)
GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models
by: Zhao, Pengxiang, et al.
Published: (2025)
Quantization of Large Language Models with an Overdetermined Basis
by: Merkulov, Daniil, et al.
Published: (2024)
ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization
by: Zhao, Weibo, et al.
Published: (2024)
FinLoRA: Finetuning Quantized Financial Large Language Models Using Low-Rank Adaptation
by: Wang, Dannong, et al.
Published: (2024)
Large Language Models on Graphs: A Comprehensive Survey
by: Jin, Bowen, et al.
Published: (2023)
Boost Post-Training Quantization via Null Space Optimization for Large Language Models
by: Zhao, Jiaqi, et al.
Published: (2025)
KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models
by: Xu, Zukang, et al.
Published: (2026)
On the Compressibility of Quantized Large Language Models
by: Mao, Yu, et al.
Published: (2024)
CrossQuant: A Post-Training Quantization Method with Smaller Quantization Kernel for Precise Large Language Model Compression
by: Liu, Wenyuan, et al.
Published: (2024)
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
by: Zhang, Zhengxin, et al.
Published: (2024)
A Performance Evaluation of a Quantized Large Language Model on Various Smartphones
by: Çöplü, Tolga, et al.
Published: (2023)
What Makes Quantization for Large Language Models Hard? An Empirical Study from the Lens of Perturbation
by: Gong, Zhuocheng, et al.
Published: (2024)
AffineQuant: Affine Transformation Quantization for Large Language Models
by: Ma, Yuexiao, et al.
Published: (2024)