| Main Authors: | Lin, Haokun, Xu, Haobo, Wu, Yichen, Cui, Jingzhi, Zhang, Yingtao, Mou, Linzhan, Song, Linqi, Sun, Zhenan, Wei, Ying |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.01721 |
Similar Items
DuQuant++: Fine-grained Rotation Enhances Microscaling FP4 Quantization
by: Lin, Haokun, et al.
Published: (2026)
DopQ-ViT: Towards Distribution-Friendly and Outlier-Aware Post-Training Quantization for Vision Transformers
by: Yang, Lianwei, et al.
Published: (2024)
Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs
by: Lin, Haokun, et al.
Published: (2025)
MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric
by: Lin, Haokun, et al.
Published: (2024)
LRQ-DiT: Log-Rotation Post-Training Quantization of Diffusion Transformers for Image and Video Generation
by: Yang, Lianwei, et al.
Published: (2025)
QuantDemoire: Quantization with Outlier Aware for Image Demoiréing
by: Chen, Zheng, et al.
Published: (2025)
QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning
by: Chen, Jiun-Man, et al.
Published: (2024)
SplitQuantV2: Enhancing Low-Bit Quantization of LLMs Without GPUs
by: Song, Jaewoo, et al.
Published: (2025)
PrefixQuant: Eliminating Outliers by Prefixed Tokens for Large Language Models Quantization
by: Chen, Mengzhao, et al.
Published: (2024)
QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models
by: Zhang, Jingxuan, et al.
Published: (2026)
Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion
by: Mou, Linzhan, et al.
Published: (2024)
Robust Robot Walker: Learning Agile Locomotion over Tiny Traps
by: Zhu, Shaoting, et al.
Published: (2024)
EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs
by: Tang, Hanlin, et al.
Published: (2024)
SplitQuant: Layer Splitting for Low-Bit Neural Network Quantization
by: Song, Jaewoo, et al.
Published: (2025)
CodeQuant: Unified Clustering and Quantization for Enhanced Outlier Smoothing in Low-Precision Mixture-of-Experts
by: Yin, Xiangyang, et al.
Published: (2026)
DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization
by: Shao, Yuantian, et al.
Published: (2025)
SliderQuant: Accurate Post-Training Quantization for LLMs
by: Wang, Shigeng, et al.
Published: (2026)
Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge
by: Shen, Xuan, et al.
Published: (2023)
Accurate Block Quantization in LLMs with Outliers
by: Trukhanov, Nikita, et al.
Published: (2024)
Truncated Non-Uniform Quantization for Distributed SGD
by: Yan, Guangfeng, et al.
Published: (2024)
CalibQuant: 1-Bit KV Cache Quantization for Multimodal LLMs
by: Han, Insu, et al.
Published: (2025)
PolarQuant: Quantizing KV Caches with Polar Transformation
by: Han, Insu, et al.
Published: (2025)
FrameQuant: Flexible Low-Bit Quantization for Transformers
by: Adepu, Harshavardhan, et al.
Published: (2024)
MedREK: Retrieval-Based Editing for Medical LLMs with Key-Aware Prompts
by: Xia, Shujun, et al.
Published: (2025)
Graph and Skipped Transformer: Exploiting Spatial and Temporal Modeling Capacities for Efficient 3D Human Pose Estimation
by: Cui, Mengmeng, et al.
Published: (2024)
DynaQuant: Dynamic Mixed-Precision Quantization for Learned Image Compression
by: Bao, Youneng, et al.
Published: (2025)
Reverse Thinking Makes LLMs Stronger Reasoners
by: Chen, Justin Chih-Yao, et al.
Published: (2024)
Mix-Quant: Quantized Prefilling, Precise Decoding for Agentic LLMs
by: Lu, Haiquan, et al.
Published: (2026)
NestQuant: Nested Lattice Quantization for Matrix Products and LLMs
by: Savkin, Semyon, et al.
Published: (2025)
DIMO: Diverse 3D Motion Generation for Arbitrary Objects
by: Mou, Linzhan, et al.
Published: (2025)
ViT-EnsembleAttack: Augmenting Ensemble Models for Stronger Adversarial Transferability in Vision Transformers
by: Cao, Hanwen, et al.
Published: (2025)
QuantAttack: Exploiting Dynamic Quantization to Attack Vision Transformers
by: Baras, Amit, et al.
Published: (2023)
DVD-Quant: Data-free Video Diffusion Transformers Quantization
by: Li, Zhiteng, et al.
Published: (2025)
AffineQuant: Affine Transformation Quantization for Large Language Models
by: Ma, Yuexiao, et al.
Published: (2024)
LLMEasyQuant: Scalable Quantization for Parallel and Distributed LLM Inference
by: Liu, Dong, et al.
Published: (2024)
Improved Quantization Strategies for Managing Heavy-tailed Gradients in Distributed Learning
by: Yan, Guangfeng, et al.
Published: (2024)
D$^2$Quant: Accurate Low-bit Post-Training Weight Quantization for LLMs
by: Yan, Xianglong, et al.
Published: (2026)
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
by: Lin, Haokun, et al.
Published: (2025)
OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting
by: Hu, Xing, et al.
Published: (2025)
PolarQuant: Leveraging Polar Transformation for Efficient Key Cache Quantization and Decoding Acceleration
by: Wu, Songhao, et al.
Published: (2025)