| Main Authors: | von Rad, Jonathan, Cao, Yong, Geiger, Andreas |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.09130 |
Similar Items
UniComp: Rethinking Video Compression Through Informational Uniqueness
by: Yuan, Chao, et al.
Published: (2025)
Investigating the Effect of Network Pruning on Performance and Interpretability
by: von Rad, Jonathan, et al.
Published: (2024)
MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models
by: He, Haoyu, et al.
Published: (2025)
UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs
by: Chiang, Hung-Yueh, et al.
Published: (2025)
Prune-Quantize-Distill: An Ordered Pipeline for Efficient Neural Network Compression
by: Zhou, Longsheng, et al.
Published: (2026)
UniSD: Towards a Unified Self-Distillation Framework for Large Language Models
by: Jin, Yiqiao, et al.
Published: (2026)
UniDetox: Universal Detoxification of Large Language Models via Dataset Distillation
by: Lu, Huimin, et al.
Published: (2025)
PPC-GPT: Federated Task-Specific Compression of Large Language Models via Pruning and Chain-of-Thought Distillation
by: Fan, Tao, et al.
Published: (2025)
Extreme Compression of Large Language Models via Additive Quantization
by: Egiazarian, Vage, et al.
Published: (2024)
QPruner: Probabilistic Decision Quantization for Structured Pruning in Large Language Models
by: Zhou, Changhai, et al.
Published: (2024)
On the Compressibility of Quantized Large Language Models
by: Mao, Yu, et al.
Published: (2024)
Soft Label Pruning and Quantization for Large-Scale Dataset Distillation
by: Lingao, Xiao, et al.
Published: (2026)
Compact Language Models via Pruning and Knowledge Distillation
by: Muralidharan, Saurav, et al.
Published: (2024)
TuneComp: Joint Fine-tuning and Compression for Large Foundation Models
by: Chen, Xiangyu, et al.
Published: (2025)
Self-Data Distillation for Recovering Quality in Pruned Large Language Models
by: Thangarasa, Vithursan, et al.
Published: (2024)
Uni-OPD: Unifying On-Policy Distillation with a Dual-Perspective Recipe
by: Hou, Wenjin, et al.
Published: (2026)
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit
by: Gong, Ruihao, et al.
Published: (2024)
Lillama: Large Language Models Compression via Low-Rank Feature Distillation
by: Sy, Yaya, et al.
Published: (2024)
UniPruning: Unifying Local Metric and Global Feedback for Scalable Sparse LLMs
by: Ding, Yizhuo, et al.
Published: (2025)
Unified Stochastic Framework for Neural Network Quantization and Pruning
by: Zhang, Haoyu, et al.
Published: (2024)
Spatio-Temporal Pruning for Compressed Spiking Large Language Models
by: Jiang, Yi, et al.
Published: (2025)
EvoComp: Learning Visual Token Compression for Multimodal Large Language Models via Semantic-Guided Evolutionary Labeling
by: Song, Jiafei, et al.
Published: (2026)
SDMPrune: Self-Distillation MLP Pruning for Efficient Large Language Models
by: Zhu, Hourun, et al.
Published: (2025)
EPSD: Early Pruning with Self-Distillation for Efficient Model Compression
by: Chen, Dong, et al.
Published: (2024)
A Comprehensive Evaluation on Quantization Techniques for Large Language Models
by: Liu, Yutong, et al.
Published: (2025)
UniMove: A Unified Model for Multi-city Human Mobility Prediction
by: Han, Chonghua, et al.
Published: (2025)
StructComp: Substituting Propagation with Structural Compression in Training Graph Contrastive Learning
by: Zhang, Shengzhong, et al.
Published: (2023)
Foundations of Large Language Model Compression -- Part 1: Weight Quantization
by: Young, Sean I.
Published: (2024)
Compression Scaling Laws: Unifying Sparsity and Quantization
by: Frantar, Elias, et al.
Published: (2025)
UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation
by: Yang, Runzhao, et al.
Published: (2024)
EdgeRazor: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation
by: Zhang, Shu-Hao, et al.
Published: (2026)
UniFlow: A Foundation Model for Unified Urban Spatio-Temporal Flow Prediction
by: Yuan, Yuan, et al.
Published: (2024)
Iterative Layer-wise Distillation for Efficient Compression of Large Language Models
by: Kovalev, Grigory, et al.
Published: (2025)
CrossQuant: A Post-Training Quantization Method with Smaller Quantization Kernel for Precise Large Language Model Compression
by: Liu, Wenyuan, et al.
Published: (2024)
GPU-Accelerated INT8 Quantization for KV Cache Compression in Large Language Models
by: Taneja, Maanas, et al.
Published: (2026)
Prune, Update and Trim: Robust Structured Pruning for Large Language Models
by: Mecke, Diego Coello de Portugal, et al.
Published: (2026)
Pruning and Distilling Mixture-of-Experts into Dense Language Models
by: Kim, Junhyuck, et al.
Published: (2026)
UniCO: Towards a Unified Model for Combinatorial Optimization Problems
by: Zong, Zefang, et al.
Published: (2025)
Restoring Pruned Large Language Models via Lost Component Compensation
by: Feng, Zijian, et al.
Published: (2025)
LBLLM: Lightweight Binarization of Large Language Models via Three-Stage Distillation
by: Song, Siqing, et al.
Published: (2026)