:: Library Catalog

Bibliographic details
Main authors:	Lee, Seoungsub; Kim, In Seo; Kim, Seon Wook
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online access:	https://arxiv.org/abs/2604.04701

Similar items

Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition
by: Cho, Yoonjun, et al.
Published: (2025)

Towards Scalable Handwriting Communication via EEG Decoding and Latent Embedding Integration
by: Kim, Jun-Young, et al.
Published: (2024)

OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
by: Zhang, Stephen, et al.
Published: (2024)

FedWSQ: Efficient Federated Learning with Weight Standardization and Distribution-Aware Non-Uniform Quantization
by: Kim, Seung-Wook, et al.
Published: (2025)

ODIM: Outlier Detection via Likelihood of Under-Fitted Generative Models
by: Kim, Dongha, et al.
Published: (2023)

No Token Left Behind: Reliable KV Cache Compression via Importance-Aware Mixed Precision Quantization
by: Yang, June Yong, et al.
Published: (2024)

STaMP: Sequence Transformation and Mixed Precision for Low-Precision Activation Quantization
by: Federici, Marco, et al.
Published: (2025)

DMQ: Dissecting Outliers of Diffusion Models for Post-Training Quantization
by: Lee, Dongyeun, et al.
Published: (2025)

LittleBit: Ultra Low-Bit Quantization via Latent Factorization
by: Lee, Banseok, et al.
Published: (2025)

Compressing Large Language Models using Low Rank and Low Precision Decomposition
by: Saha, Rajarshi, et al.
Published: (2024)

LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices
by: Lee, Jung Hyun, et al.
Published: (2024)

Preserve-Then-Quantize: Balancing Rank Budgets for Quantization Error Reconstruction in LLMs
by: Cho, Yoonjun, et al.
Published: (2026)

Widening the Gap: Exploiting LLM Quantization via Outlier Injection
by: Zhan, Xiaohua, et al.
Published: (2026)

Mixed-Precision Quantization for Language Models: Techniques and Prospects
by: Rakka, Mariam, et al.
Published: (2025)

Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression
by: Kim, Dohyun, et al.
Published: (2025)

GlowQ: Group-Shared LOw-Rank Approximation for Quantized LLMs
by: An, Selim, et al.
Published: (2026)

APreQEL: Adaptive Mixed Precision Quantization For Edge LLMs
by: Bouzouad, Meriem, et al.
Published: (2026)

AnyBCQ: Hardware Efficient Flexible Binary-Coded Quantization for Multi-Precision LLMs
by: Park, Gunho, et al.
Published: (2025)

Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation
by: Ham, Seokil, et al.
Published: (2024)

MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models
by: Liu, Wenyuan, et al.
Published: (2025)

Low-Rank Tensor Decompositions for the Theory of Neural Networks
by: Borsoi, Ricardo, et al.
Published: (2025)

MixKVQ: Query-Aware Mixed-Precision KV Cache Quantization for Long-Context Reasoning
by: Zhang, Tao, et al.
Published: (2025)

Decoupling General and Personalized Knowledge in Federated Learning via Additive and Low-Rank Decomposition
by: Wu, Xinghao, et al.
Published: (2024)

RILQ: Rank-Insensitive LoRA-based Quantization Error Compensation for Boosting 2-bit Large Language Model Accuracy
by: Lee, Geonho, et al.
Published: (2024)

Low-Rank Quantization-Aware Training for LLMs
by: Bondarenko, Yelysei, et al.
Published: (2024)

RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference
by: Gautam, Arpit Singh, et al.
Published: (2026)

EdgeRazor: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation
by: Zhang, Shu-Hao, et al.
Published: (2026)

Two-Stage Grid Optimization for Group-wise Quantization of LLMs
by: Kim, Junhan, et al.
Published: (2026)

BoA: Attention-aware Post-training Quantization without Backpropagation
by: Kim, Junhan, et al.
Published: (2024)

On-Chip Hardware-Aware Quantization for Mixed Precision Neural Networks
by: Huang, Wei, et al.
Published: (2023)

GraLoRA: Granular Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
by: Jung, Yeonjoon, et al.
Published: (2025)

Activation Outliers in Transformer Quantization: Reproduction, Statistical Analysis, and Deployment Tradeoffs
by: Kaliaperumal, Pranav Kumar
Published: (2026)

Scale When Needed: Adaptive Neuron-level Mixed Precision Quantization Aware Training
by: Varshney, Ayush K., et al.
Published: (2026)

Optimal Policy Sparsification and Low Rank Decomposition for Deep Reinforcement Learning
by: Goddla, Vikram
Published: (2024)

Fast and Low-Cost Genomic Foundation Models via Outlier Removal
by: Luo, Haozheng, et al.
Published: (2025)

OSC: Hardware Efficient W4A4 Quantization via Outlier Separation in Channel Dimension
by: Zhang, Zhiyuan, et al.
Published: (2026)

Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers
by: Kim, Junhan, et al.
Published: (2024)

Memory-Efficient Acceleration of Block Low-Rank Foundation Models on Resource Constrained GPUs
by: Abillama, Pierre, et al.
Published: (2025)

Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
by: Park, Jungwoo, et al.
Published: (2025)

Deep Learning and Matrix Completion-aided IoT Network Localization in the Outlier Scenarios
by: Kim, Sunwoo
Published: (2025)