| Main Authors: | Du, Dayou, Zhang, Yijia, Cao, Shijie, Guo, Jiaqi, Cao, Ting, Chu, Xiaowen, Xu, Ningyi |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.10631 |
Similar Items
BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache
by: Du, Dayou, et al.
Published: (2025)
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
by: Dong, Peijie, et al.
Published: (2024)
BitNet Distillation
by: Wu, Xun, et al.
Published: (2025)
Self-Filtered Distillation with LLMs-generated Trust Indicators for Reliable Patent Classification
by: Yoo, Yongmin, et al.
Published: (2025)
AnTKV: Anchor Token-Aware Sub-Bit Vector Quantization for KV Cache in Large Language Models
by: Li, Zeyu, et al.
Published: (2025)
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
by: Gao, Yizhao, et al.
Published: (2024)
Dissecting Bit-Level Scaling Laws in Quantizing Vision Generative Models
by: Ding, Xin, et al.
Published: (2025)
Why Not Act on What You Know? Unleashing Safety Potential of LLMs via Self-Aware Guard Enhancement
by: Ding, Peng, et al.
Published: (2025)
CSR:Achieving 1 Bit Key-Value Cache via Sparse Representation
by: Zhang, Hongxuan, et al.
Published: (2024)
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge
by: Wei, Jianyu, et al.
Published: (2024)
LittleBit: Ultra Low-Bit Quantization via Latent Factorization
by: Lee, Banseok, et al.
Published: (2025)
Efficient Knowledge Injection in LLMs via Self-Distillation
by: Kujanpää, Kalle, et al.
Published: (2024)
Self-Distilled RLVR
by: Yang, Chenxu, et al.
Published: (2026)
EchoDistill:Alignment Noisy-to-Clean Self-Distillation for Robust Audio LLMs
by: Lin, Liang, et al.
Published: (2026)
Unleashing Low-Bit Inference on Ascend NPUs: A Comprehensive Evaluation of HiFloat Formats
by: Zhao, Pengxiang, et al.
Published: (2026)
Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity
by: Zhang, Di, et al.
Published: (2026)
BitNet a4.8: 4-bit Activations for 1-bit LLMs
by: Wang, Hongyu, et al.
Published: (2024)
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
by: Ouyang, Xu, et al.
Published: (2024)
Distill-C: Enhanced NL2SQL via Distilled Customization with LLMs
by: Hoang, Cong Duy Vu, et al.
Published: (2025)
CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation
by: Shen, Zhenyi, et al.
Published: (2025)
BitNet b1.58 2B4T Technical Report
by: Ma, Shuming, et al.
Published: (2025)
LLM-FP4: 4-Bit Floating-Point Quantized Transformers
by: Liu, Shih-yang, et al.
Published: (2023)
A + B: A General Generator-Reader Framework for Optimizing LLMs to Unleash Synergy Potential
by: Tang, Wei, et al.
Published: (2024)
BitDelta: Your Fine-Tune May Only Be Worth One Bit
by: Liu, James, et al.
Published: (2024)
RLKD: Distilling LLMs' Reasoning via Reinforcement Learning
by: Xu, Shicheng, et al.
Published: (2025)
Majority Bit-Aware Watermarking For Large Language Models
by: Xu, Jiahao, et al.
Published: (2025)
Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey
by: Du, Dayou, et al.
Published: (2024)
Exploring Knowledge Purification in Multi-Teacher Knowledge Distillation for LLMs
by: Jin, Ruihan, et al.
Published: (2026)
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time
by: Deschenaux, Justin, et al.
Published: (2024)
Distilling Text Style Transfer With Self-Explanation From LLMs
by: Zhang, Chiyu, et al.
Published: (2024)
X-OPD: Cross-Modal On-Policy Distillation for Capability Alignment in Speech LLMs
by: Cao, Di, et al.
Published: (2026)
Equational Bit-Vector Solving via Strong Gröbner Bases
by: Song, Jiaxin, et al.
Published: (2024)
Mapping the Schedule x Bit-Width Boundary in Sub-100M Quantisation-Aware Training
by: Thomassen, Christian Brandt
Published: (2026)
SignRoundV2: Toward Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs
by: Cheng, Wenhua, et al.
Published: (2025)
RM-Distiller: Exploiting Generative LLM for Reward Model Distillation
by: Zhou, Hongli, et al.
Published: (2026)
Bit-Vector CHC Solving for Binary Analysis and Binary Analysis for Bit-Vector CHC Solving
by: Bembenek, Aaron, et al.
Published: (2026)
Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs
by: Boizard, Nicolas, et al.
Published: (2024)
It's Morphing Time: Unleashing the Potential of Multiple LLMs via Multi-objective Optimization
by: Li, Bingdong, et al.
Published: (2024)
BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs
by: Wang, Hongyu, et al.
Published: (2025)
Bit-level BPE: Below the byte boundary
by: Moon, Sangwhan, et al.
Published: (2025)