| Main author: | Iyengar, Venugopalan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2605.10655 |
Similar items
StatQAT: Statistical Quantizer Optimization for Deep Networks
by: Aktukmak, Mehmet, et al.
Published: (2026)
EfQAT: An Efficient Framework for Quantization-Aware Training
by: Ashkboos, Saleh, et al.
Published: (2024)
Attn-QAT: 4-Bit Attention With Quantization-Aware Training
by: Zhang, Peiyuan, et al.
Published: (2026)
AdaQAT: Adaptive Bit-Width Quantization-Aware Training
by: Gernigon, Cédric, et al.
Published: (2024)
DL-QAT: Weight-Decomposed Low-Rank Quantization-Aware Training for Large Language Models
by: Ke, Wenjin, et al.
Published: (2025)
StableQAT: Stable Quantization-Aware Training at Ultra-Low Bitwidths
by: Chen, Tianyi, et al.
Published: (2026)
MF-QAT: Multi-Format Quantization-Aware Training for Elastic Inference
by: Xu, Zifei, et al.
Published: (2026)
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
by: Chen, Mengzhao, et al.
Published: (2024)
1-Bit Wonder: Improving QAT Performance in the Low-Bit Regime through K-Means Quantization
by: Maskey, Sohir, et al.
Published: (2026)
DivQAT: Enhancing Robustness of Quantized Convolutional Neural Networks against Model Extraction Attacks
by: Khaled, Kacem, et al.
Published: (2025)
Optimizing Large Language Models through Quantization: A Comparative Analysis of PTQ and QAT Techniques
by: Hasan, Jahid
Published: (2024)
Trellis: Learning to Compress Key-Value Memory in Attention Models
by: Karami, Mahdi, et al.
Published: (2025)
Unifying Block-wise PTQ and Distillation-based QAT for Progressive Quantization toward 2-bit Instruction-Tuned LLMs
by: Lee, Jung Hyun, et al.
Published: (2025)
Efficient VQ-QAT and Mixed Vector/Linear quantized Neural Networks
by: Gou, Terry, et al.
Published: (2026)
Bit-by-Bit: Progressive QAT Strategy with Outlier Channel Splitting for Stable Low-Bit LLMs
by: Xu, Binxing, et al.
Published: (2026)
SLA2: Sparse-Linear Attention with Learnable Routing and QAT
by: Zhang, Jintao, et al.
Published: (2026)
Trust via Reputation of Conviction
by: Iyengar, Aravind R.
Published: (2026)
CRVQ: Channel-Relaxed Vector Quantization for Extreme Compression of LLMs
by: Xu, Yuzhuang, et al.
Published: (2024)
A Differentiable Bayesian Relaxation for Latent Partial-Order Inference
by: Li, Dongqing, et al.
Published: (2026)
Restructuring Vector Quantization with the Rotation Trick
by: Fifty, Christopher, et al.
Published: (2024)
Model-Free Approximate Bayesian Learning for Large-Scale Conversion Funnel Optimization
by: Iyengar, Garud, et al.
Published: (2024)
The Cost of Learning under Multiple Change Points
by: Gafni, Tomer, et al.
Published: (2026)
Learning the Pareto Front Using Bootstrapped Observation Samples
by: Kim, Wonyoung, et al.
Published: (2023)
ADMM-Q: An Improved Hessian-based Weight Quantizer for Post-Training Quantization of Large Language Models
by: Lucas, Ryan, et al.
Published: (2026)
Effect of Weight Quantization on Learning Models by Typical Case Analysis
by: Kashiwamura, Shuhei, et al.
Published: (2024)
A Giant-Step Baby-Step Classifier For Scalable and Real-Time Anomaly Detection In Industrial Control Systems and Water Treatment Systems
by: Venugopalan, Sarad, et al.
Published: (2025)
Quantized Delta Weight Is Safety Keeper
by: Liu, Yule, et al.
Published: (2024)
Analogy between Boltzmann machines and Feynman path integrals
by: Iyengar, Srinivasan S., et al.
Published: (2023)
Gaussian Weight Sampling for Scalable, Efficient and Stable Pseudo-Quantization Training
by: Ahn, Myeonghwan, et al.
Published: (2025)
AWP: Activation-Aware Weight Pruning and Quantization with Projected Gradient Descent
by: Liu, Jing, et al.
Published: (2025)
A2Q+: Improving Accumulator-Aware Weight Quantization
by: Colbert, Ian, et al.
Published: (2024)
Generalized Radius and Integrated Codebook Transforms for Differentiable Vector Quantization
by: You, Haochen, et al.
Published: (2026)
Optimizer's Information Criterion: Dissecting and Correcting Bias in Data-Driven Optimization
by: Iyengar, Garud, et al.
Published: (2023)
A Differentially Private Weighted Empirical Risk Minimization Procedure and its Application to Outcome Weighted Learning
by: Giddens, Spencer, et al.
Published: (2023)
SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights
by: Müller, Lorenz K., et al.
Published: (2025)
TruncQuant: Truncation-Ready Quantization for DNNs with Flexible Weight Bit Precision
by: Kim, Jinhee, et al.
Published: (2025)
CCQ: Convolutional Code for Extreme Low-bit Quantization in LLMs
by: Zhou, Zhaojing, et al.
Published: (2025)
Differentiable Search for Finding Optimal Quantization Strategy
by: Li, Lianqiang, et al.
Published: (2024)
Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing
by: Tang, Siao, et al.
Published: (2023)
DiVeQ: Differentiable Vector Quantization Using the Reparameterization Trick
by: Vali, Mohammad Hassan, et al.
Published: (2025)