| Main authors: | Wang, Zhaoyang; Wang, Dong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2511.05898 |
Similar items
MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity
by: Choi, Kanghyun, et al.
Published: (2024)
Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization
by: Luo, Róisín, et al.
Published: (2024)
Fine-Grained Post-Training Quantization for Large Vision Language Models with Quantization-Aware Integrated Gradients
by: Xiang, Ziwei, et al.
Published: (2026)
Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precision
by: Huang, Xijie, et al.
Published: (2023)
Breaking Modality Heterogeneity in Low-Bit Quantization for Large Vision-Language Models
by: Zhong, Yi, et al.
Published: (2026)
Progressive Fine-to-Coarse Reconstruction for Accurate Low-Bit Post-Training Quantization in Vision Transformers
by: Ding, Rui, et al.
Published: (2024)
Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers
by: Chen, Lei, et al.
Published: (2024)
Collaborative Few-Step Distillation and Low-Bit Quantization for Wan2.2 Dual-Expert Video Diffusion Models
by: Du, Jinyang, et al.
Published: (2026)
MARR: Module-Adaptive Residual Reconstruction for Low-Bit Post-Training Quantization
by: Su, Le, et al.
Published: (2026)
QuEPT: Quantized Elastic Precision Transformers with One-Shot Calibration for Multi-Bit Switching
by: Xu, Ke, et al.
Published: (2026)
Q-Sched: Pushing the Boundaries of Few-Step Diffusion Models with Quantization-Aware Scheduling
by: Frumkin, Natalia, et al.
Published: (2025)
MBQ: Modality-Balanced Quantization for Large Vision-Language Models
by: Li, Shiyao, et al.
Published: (2024)
Q-SAM2: Accurate Quantization for Segment Anything Model 2
by: Farronato, Nicola, et al.
Published: (2025)
When Bits Break Recourse: Counterfactual-Faithful Quantization
by: Yahyati, Chaymae, et al.
Published: (2026)
P4Q: Learning to Prompt for Quantization in Visual-language Models
by: Sun, Huixin, et al.
Published: (2024)
LLM-FP4: 4-Bit Floating-Point Quantized Transformers
by: Liu, Shih-yang, et al.
Published: (2023)
QAPruner: Quantization-Aware Vision Token Pruning for Multimodal Large Language Models
by: Wang, Xinhao, et al.
Published: (2026)
BTC-LLM: Efficient Sub-1-Bit LLM Quantization via Learnable Transformation and Binary Codebook
by: Gu, Hao, et al.
Published: (2025)
DL-QAT: Weight-Decomposed Low-Rank Quantization-Aware Training for Large Language Models
by: Ke, Wenjin, et al.
Published: (2025)
Channel-wise Vector Quantization
by: Song, Wei, et al.
Published: (2026)
LUQ: Layerwise Ultra-Low Bit Quantization for Multimodal Large Language Models
by: Bhatnagar, Shubhang, et al.
Published: (2025)
Learning from Loss Landscape: Generalizable Mixed-Precision Quantization via Adaptive Sharpness-Aware Gradient Aligning
by: Ma, Lianbo, et al.
Published: (2025)
ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization
by: Liu, Zechun, et al.
Published: (2025)
FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation
by: Wu, Zhuguanyu, et al.
Published: (2025)
Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs
by: Aggarwal, Shivam, et al.
Published: (2023)
Modality-Aware and Anatomical Vector-Quantized Autoencoding for Multimodal Brain MRI
by: Li, Mingjie, et al.
Published: (2026)
Quasar-ViT: Hardware-Oriented Quantization-Aware Architecture Search for Vision Transformers
by: Li, Zhengang, et al.
Published: (2024)
Self-Supervised Quantization-Aware Knowledge Distillation
by: Zhao, Kaiqi, et al.
Published: (2024)
Scalable Image Tokenization with Index Backpropagation Quantization
by: Shi, Fengyuan, et al.
Published: (2024)
Scaling Image Tokenizers with Grouped Spherical Quantization
by: Wang, Jiangtao, et al.
Published: (2024)
AdaLoRA-QAT: Adaptive Low-Rank and Quantization-Aware Segmentation
by: Deb, Prantik, et al.
Published: (2026)
SegQuant: A Semantics-Aware and Generalizable Quantization Framework for Diffusion Models
by: Zhang, Jiaji, et al.
Published: (2025)
Quantization-Aware Neuromorphic Architecture for Skin Disease Classification on Resource-Constrained Devices
by: Wang, Haitian, et al.
Published: (2025)
Q-HyViT: Post-Training Quantization of Hybrid Vision Transformers with Bridge Block Reconstruction for IoT Systems
by: Lee, Jemin, et al.
Published: (2023)
Efficient Quantization-Aware Training on Segment Anything Model in Medical Images and Its Deployment
by: Lu, Haisheng, et al.
Published: (2024)
Timestep-Aware SVDQuant-GPTQ for W4A4 Quantization of Wan2.2-I2V
by: Wu, Junhao, et al.
Published: (2026)
Post-Training Quantization for Video Matting
by: Zhu, Tianrui, et al.
Published: (2025)
PTQAT: A Hybrid Parameter-Efficient Quantization Algorithm for 3D Perception Tasks
by: Wang, Xinhao, et al.
Published: (2025)
Exploiting Information Redundancy in Attention Maps for Extreme Quantization of Vision Transformers
by: Maisonnave, Lucas, et al.
Published: (2025)
DC-PCN: Point Cloud Completion Network with Dual-Codebook Guided Quantization
by: Wu, Qiuxia, et al.
Published: (2025)