:: Library Catalog

Bibliographic Details
Main Authors:	Lin, Haokun; Xu, Haobo; Wu, Yichen; Cui, Jingzhi; Zhang, Yingtao; Mou, Linzhan; Song, Linqi; Sun, Zhenan; Wei, Ying
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2406.01721

Similar Items

DuQuant++: Fine-grained Rotation Enhances Microscaling FP4 Quantization
by: Lin, Haokun, et al.
Published: (2026)

DopQ-ViT: Towards Distribution-Friendly and Outlier-Aware Post-Training Quantization for Vision Transformers
by: Yang, Lianwei, et al.
Published: (2024)

Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs
by: Lin, Haokun, et al.
Published: (2025)

MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric
by: Lin, Haokun, et al.
Published: (2024)

LRQ-DiT: Log-Rotation Post-Training Quantization of Diffusion Transformers for Image and Video Generation
by: Yang, Lianwei, et al.
Published: (2025)

QuantDemoire: Quantization with Outlier Aware for Image Demoiréing
by: Chen, Zheng, et al.
Published: (2025)

QuantTune: Optimizing Model Quantization with Adaptive Outlier-Driven Fine Tuning
by: Chen, Jiun-Man, et al.
Published: (2024)

SplitQuantV2: Enhancing Low-Bit Quantization of LLMs Without GPUs
by: Song, Jaewoo, et al.
Published: (2025)

PrefixQuant: Eliminating Outliers by Prefixed Tokens for Large Language Models Quantization
by: Chen, Mengzhao, et al.
Published: (2024)

QuantVLA: Scale-Calibrated Post-Training Quantization for Vision-Language-Action Models
by: Zhang, Jingxuan, et al.
Published: (2026)

Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion
by: Mou, Linzhan, et al.
Published: (2024)

Robust Robot Walker: Learning Agile Locomotion over Tiny Traps
by: Zhu, Shaoting, et al.
Published: (2024)

EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs
by: Tang, Hanlin, et al.
Published: (2024)

SplitQuant: Layer Splitting for Low-Bit Neural Network Quantization
by: Song, Jaewoo, et al.
Published: (2025)

CodeQuant: Unified Clustering and Quantization for Enhanced Outlier Smoothing in Low-Precision Mixture-of-Experts
by: Yin, Xiangyang, et al.
Published: (2026)

DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization
by: Shao, Yuantian, et al.
Published: (2025)

SliderQuant: Accurate Post-Training Quantization for LLMs
by: Wang, Shigeng, et al.
Published: (2026)

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge
by: Shen, Xuan, et al.
Published: (2023)

Accurate Block Quantization in LLMs with Outliers
by: Trukhanov, Nikita, et al.
Published: (2024)

Truncated Non-Uniform Quantization for Distributed SGD
by: Yan, Guangfeng, et al.
Published: (2024)

CalibQuant: 1-Bit KV Cache Quantization for Multimodal LLMs
by: Han, Insu, et al.
Published: (2025)

PolarQuant: Quantizing KV Caches with Polar Transformation
by: Han, Insu, et al.
Published: (2025)

FrameQuant: Flexible Low-Bit Quantization for Transformers
by: Adepu, Harshavardhan, et al.
Published: (2024)

MedREK: Retrieval-Based Editing for Medical LLMs with Key-Aware Prompts
by: Xia, Shujun, et al.
Published: (2025)

Graph and Skipped Transformer: Exploiting Spatial and Temporal Modeling Capacities for Efficient 3D Human Pose Estimation
by: Cui, Mengmeng, et al.
Published: (2024)

DynaQuant: Dynamic Mixed-Precision Quantization for Learned Image Compression
by: Bao, Youneng, et al.
Published: (2025)

Reverse Thinking Makes LLMs Stronger Reasoners
by: Chen, Justin Chih-Yao, et al.
Published: (2024)

Mix-Quant: Quantized Prefilling, Precise Decoding for Agentic LLMs
by: Lu, Haiquan, et al.
Published: (2026)

NestQuant: Nested Lattice Quantization for Matrix Products and LLMs
by: Savkin, Semyon, et al.
Published: (2025)

DIMO: Diverse 3D Motion Generation for Arbitrary Objects
by: Mou, Linzhan, et al.
Published: (2025)

ViT-EnsembleAttack: Augmenting Ensemble Models for Stronger Adversarial Transferability in Vision Transformers
by: Cao, Hanwen, et al.
Published: (2025)

QuantAttack: Exploiting Dynamic Quantization to Attack Vision Transformers
by: Baras, Amit, et al.
Published: (2023)

DVD-Quant: Data-free Video Diffusion Transformers Quantization
by: Li, Zhiteng, et al.
Published: (2025)

AffineQuant: Affine Transformation Quantization for Large Language Models
by: Ma, Yuexiao, et al.
Published: (2024)

LLMEasyQuant: Scalable Quantization for Parallel and Distributed LLM Inference
by: Liu, Dong, et al.
Published: (2024)

Improved Quantization Strategies for Managing Heavy-tailed Gradients in Distributed Learning
by: Yan, Guangfeng, et al.
Published: (2024)

D$^2$Quant: Accurate Low-bit Post-Training Weight Quantization for LLMs
by: Yan, Xianglong, et al.
Published: (2026)

TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
by: Lin, Haokun, et al.
Published: (2025)

OstQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting
by: Hu, Xing, et al.
Published: (2025)

PolarQuant: Leveraging Polar Transformation for Efficient Key Cache Quantization and Decoding Acceleration
by: Wu, Songhao, et al.
Published: (2025)