| Main Authors: | Chong, Hyochan, Kim, Dongkyu, Kim, Changdong, Choi, Minseop |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.06694 |
Similar Items
RaBiT: Residual-Aware Binarization Training for Accurate and Efficient LLMs
by: You, Youngcheon, et al.
Published: (2026)
LittleBit: Ultra Low-Bit Quantization via Latent Factorization
by: Lee, Banseok, et al.
Published: (2025)
TruncQuant: Truncation-Ready Quantization for DNNs with Flexible Weight Bit Precision
by: Kim, Jinhee, et al.
Published: (2025)
Quant-dLLM: Post-Training Extreme Low-Bit Quantization for Diffusion Large Language Models
by: Zhang, Tianao, et al.
Published: (2025)
FrameQuant: Flexible Low-Bit Quantization for Transformers
by: Adepu, Harshavardhan, et al.
Published: (2024)
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance
by: Kim, Jinuk, et al.
Published: (2025)
AffineQuant: Affine Transformation Quantization for Large Language Models
by: Ma, Yuexiao, et al.
Published: (2024)
MoBiQuant: Mixture-of-Bits Quantization for Token-Adaptive Any-Precision LLM
by: Wang, Dongwei, et al.
Published: (2026)
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
by: Xiao, Guangxuan, et al.
Published: (2022)
InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
by: Li, Ke, et al.
Published: (2026)
SplitQuant: Layer Splitting for Low-Bit Neural Network Quantization
by: Song, Jaewoo, et al.
Published: (2025)
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
by: Shao, Wenqi, et al.
Published: (2023)
BAQ: Efficient Bit Allocation Quantization for Large Language Models
by: Zhang, Chao, et al.
Published: (2025)
ContrastCAD: Contrastive Learning-based Representation Learning for Computer-Aided Design Models
by: Jung, Minseop, et al.
Published: (2024)
pQuant: Towards Effective Low-Bit Language Models via Decoupled Linear Quantization-Aware Training
by: Zhang, Wenzheng, et al.
Published: (2026)
MSQ: Memory-Efficient Bit Sparsification Quantization
by: Han, Seokho, et al.
Published: (2025)
A Lightweight CNN-Transformer Model for Learning Traveling Salesman Problems
by: Jung, Minseop, et al.
Published: (2023)
SplitQuantV2: Enhancing Low-Bit Quantization of LLMs Without GPUs
by: Song, Jaewoo, et al.
Published: (2025)
SpecQuant: Spectral Decomposition and Adaptive Truncation for Ultra-Low-Bit LLMs Quantization
by: Zhao, Zhixiong, et al.
Published: (2025)
PrefixQuant: Eliminating Outliers by Prefixed Tokens for Large Language Models Quantization
by: Chen, Mengzhao, et al.
Published: (2024)
KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization
by: Zhang, Tianyi, et al.
Published: (2024)
QQQ: Quality Quattuor-Bit Quantization for Large Language Models
by: Zhang, Ying, et al.
Published: (2024)
LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid
by: Zhang, Tianyi, et al.
Published: (2024)
FibQuant: Universal Vector Quantization for Random-Access KV-Cache Compression
by: Lee, Namyoon, et al.
Published: (2026)
LittleBit-2: Maximizing the Spectral Energy Gain in Sub-1-Bit LLMs via Latent Geometry Alignment
by: Lee, Banseok, et al.
Published: (2026)
Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization
by: Xi, Haocheng, et al.
Published: (2026)
LogQuant: Log-Distributed 2-Bit Quantization of KV Cache with Superior Accuracy Preservation
by: Chen, Han, et al.
Published: (2025)
MergeQuant: Accurate 4-bit Static Quantization of Large Language Models by Channel-wise Calibration
by: Wang, Jinguang, et al.
Published: (2025)
L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models
by: Jeon, Hyesung, et al.
Published: (2024)
CrossQuant: A Post-Training Quantization Method with Smaller Quantization Kernel for Precise Large Language Model Compression
by: Liu, Wenyuan, et al.
Published: (2024)
ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models
by: Zeng, Chao, et al.
Published: (2024)
How to Parameterize Asymmetric Quantization Ranges for Quantization-Aware Training
by: You, Jaeseong, et al.
Published: (2024)
ApiQ: Finetuning of 2-Bit Quantized Large Language Model
by: Liao, Baohao, et al.
Published: (2024)
QuIP: 2-Bit Quantization of Large Language Models With Guarantees
by: Chee, Jerry, et al.
Published: (2023)
ACFormer: Mitigating Non-linearity with Auto Convolutional Encoder for Time Series Forecasting
by: Lee, Gawon, et al.
Published: (2026)
DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization
by: Shao, Yuantian, et al.
Published: (2025)
EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs
by: Tang, Hanlin, et al.
Published: (2024)
Neural-network quantum state study of the long-range antiferromagnetic Ising chain
by: Kim, Jicheol, et al.
Published: (2023)
OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting
by: Hu, Xing, et al.
Published: (2025)
QuantLRM: Quantization of Large Reasoning Models via Fine-Tuning Signals
by: Zhang, Nan, et al.
Published: (2026)