| Main Authors: | Li, Zhikai, Liu, Xuewen, Zhang, Jing, Gu, Qingyi |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.05628 |
Similar Items
OSAQ: Outlier Self-Absorption for Accurate Low-bit LLM Quantization
by: Li, Zhikai, et al.
Published: (2026)
EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models
by: Liu, Xuewen, et al.
Published: (2024)
DilateQuant: Accurate and Efficient Diffusion Quantization via Weight Dilation
by: Liu, Xuewen, et al.
Published: (2024)
CacheQuant: Comprehensively Accelerated Diffusion Models
by: Liu, Xuewen, et al.
Published: (2025)
Sparsity Induction for Accurate Post-Training Pruning of Large Language Models
by: Jiang, Minhao, et al.
Published: (2026)
TTAQ: Towards Stable Post-training Quantization in Continuous Domain Adaptation
by: Xiao, Junrui, et al.
Published: (2024)
PTQ4ARVG: Post-Training Quantization for AutoRegressive Visual Generation Models
by: Liu, Xuewen, et al.
Published: (2026)
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
by: Xiao, Guangxuan, et al.
Published: (2022)
D$^2$Quant: Accurate Low-bit Post-Training Weight Quantization for LLMs
by: Yan, Xianglong, et al.
Published: (2026)
QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models
by: Zhang, Jingxuan, et al.
Published: (2026)
SAQ-SAM: Semantically-Aligned Quantization for Segment Anything Model
by: Zhang, Jing, et al.
Published: (2025)
CrossQuant: A Post-Training Quantization Method with Smaller Quantization Kernel for Precise Large Language Model Compression
by: Liu, Wenyuan, et al.
Published: (2024)
Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
by: Zhang, Tianao, et al.
Published: (2025)
QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources
by: Li, Zhikai, et al.
Published: (2023)
AffineQuant: Affine Transformation Quantization for Large Language Models
by: Ma, Yuexiao, et al.
Published: (2024)
LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid
by: Zhang, Tianyi, et al.
Published: (2024)
MGRQ: Post-Training Quantization For Vision Transformer With Mixed Granularity Reconstruction
by: Yang, Lianwei, et al.
Published: (2024)
pQuant: Towards Effective Low-Bit Language Models via Decoupled Linear Quantization-Aware Training
by: Zhang, Wenzheng, et al.
Published: (2026)
NestQuant: Post-Training Integer-Nesting Quantization for On-Device DNN
by: Xie, Jianhang, et al.
Published: (2025)
Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers
by: Kim, Junhan, et al.
Published: (2024)
Efficient-SAM2: Accelerating SAM2 with Object-Aware Visual Encoding and Memory Retrieval
by: Zhang, Jing, et al.
Published: (2026)
RepLoRA: Reparameterizing Low-Rank Adaptation via the Perspective of Mixture of Experts
by: Truong, Tuan, et al.
Published: (2025)
QuantMoE-Bench: Examining Post-Training Quantization for Mixture-of-Experts
by: Li, Pingzhi, et al.
Published: (2024)
FrameQuant: Flexible Low-Bit Quantization for Transformers
by: Adepu, Harshavardhan, et al.
Published: (2024)
OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting
by: Hu, Xing, et al.
Published: (2025)
MergeQuant: Accurate 4-bit Static Quantization of Large Language Models by Channel-wise Calibration
by: Wang, Jinguang, et al.
Published: (2025)
QuantLRM: Quantization of Large Reasoning Models via Fine-Tuning Signals
by: Zhang, Nan, et al.
Published: (2026)
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
by: Shao, Wenqi, et al.
Published: (2023)
Reparameterized LLM Training via Orthogonal Equivalence Transformation
by: Qiu, Zeju, et al.
Published: (2025)
Privacy-Preserving SAM Quantization for Efficient Edge Intelligence in Healthcare
by: Li, Zhikai, et al.
Published: (2024)
Scaling Laws for Post Training Quantized Large Language Models
by: Xu, Zifei, et al.
Published: (2024)
Towards Accurate Post-training Quantization for Reparameterized Models
by: Zhang, Luoming, et al.
Published: (2024)
JacQuant: STE-Free Quantization-Aware Training via Learned Jacobian Surrogates
by: Yi, Kai, et al.
Published: (2026)
AHCQ-SAM: Toward Accurate and Hardware-Compatible Post-Training Segment Anything Model Quantization
by: Zhang, Wenlun, et al.
Published: (2025)
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance
by: Kim, Jinuk, et al.
Published: (2025)
Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
by: Pham, Cuong, et al.
Published: (2025)
Boost Post-Training Quantization via Null Space Optimization for Large Language Models
by: Zhao, Jiaqi, et al.
Published: (2025)
DBellQuant: Breaking the Bell with Double-Bell Transformation for LLMs Post Training Binarization
by: Ye, Zijian, et al.
Published: (2025)
MatGPTQ: Accurate and Efficient Post-Training Matryoshka Quantization
by: Kleinegger, Maximilian, et al.
Published: (2026)
PolarQuant: Quantizing KV Caches with Polar Transformation
by: Han, Insu, et al.
Published: (2025)