:: Library Catalog

Bibliographic Details
Main Authors:	Du, Dayou; Zhang, Yijia; Cao, Shijie; Guo, Jiaqi; Cao, Ting; Chu, Xiaowen; Xu, Ningyi
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2402.10631

Similar Items

BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache
by: Du, Dayou, et al.
Published: (2025)

STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
by: Dong, Peijie, et al.
Published: (2024)

BitNet Distillation
by: Wu, Xun, et al.
Published: (2025)

Self-Filtered Distillation with LLMs-generated Trust Indicators for Reliable Patent Classification
by: Yoo, Yongmin, et al.
Published: (2025)

AnTKV: Anchor Token-Aware Sub-Bit Vector Quantization for KV Cache in Large Language Models
by: Li, Zeyu, et al.
Published: (2025)

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
by: Gao, Yizhao, et al.
Published: (2024)

Dissecting Bit-Level Scaling Laws in Quantizing Vision Generative Models
by: Ding, Xin, et al.
Published: (2025)

Why Not Act on What You Know? Unleashing Safety Potential of LLMs via Self-Aware Guard Enhancement
by: Ding, Peng, et al.
Published: (2025)

CSR:Achieving 1 Bit Key-Value Cache via Sparse Representation
by: Zhang, Hongxuan, et al.
Published: (2024)

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge
by: Wei, Jianyu, et al.
Published: (2024)

LittleBit: Ultra Low-Bit Quantization via Latent Factorization
by: Lee, Banseok, et al.
Published: (2025)

Efficient Knowledge Injection in LLMs via Self-Distillation
by: Kujanpää, Kalle, et al.
Published: (2024)

Self-Distilled RLVR
by: Yang, Chenxu, et al.
Published: (2026)

EchoDistill:Alignment Noisy-to-Clean Self-Distillation for Robust Audio LLMs
by: Lin, Liang, et al.
Published: (2026)

Unleashing Low-Bit Inference on Ascend NPUs: A Comprehensive Evaluation of HiFloat Formats
by: Zhao, Pengxiang, et al.
Published: (2026)

Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity
by: Zhang, Di, et al.
Published: (2026)

BitNet a4.8: 4-bit Activations for 1-bit LLMs
by: Wang, Hongyu, et al.
Published: (2024)

Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
by: Ouyang, Xu, et al.
Published: (2024)

Distill-C: Enhanced NL2SQL via Distilled Customization with LLMs
by: Hoang, Cong Duy Vu, et al.
Published: (2025)

CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation
by: Shen, Zhenyi, et al.
Published: (2025)

BitNet b1.58 2B4T Technical Report
by: Ma, Shuming, et al.
Published: (2025)

LLM-FP4: 4-Bit Floating-Point Quantized Transformers
by: Liu, Shih-yang, et al.
Published: (2023)

A + B: A General Generator-Reader Framework for Optimizing LLMs to Unleash Synergy Potential
by: Tang, Wei, et al.
Published: (2024)

BitDelta: Your Fine-Tune May Only Be Worth One Bit
by: Liu, James, et al.
Published: (2024)

RLKD: Distilling LLMs' Reasoning via Reinforcement Learning
by: Xu, Shicheng, et al.
Published: (2025)

Majority Bit-Aware Watermarking For Large Language Models
by: Xu, Jiahao, et al.
Published: (2025)

Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey
by: Du, Dayou, et al.
Published: (2024)

Exploring Knowledge Purification in Multi-Teacher Knowledge Distillation for LLMs
by: Jin, Ruihan, et al.
Published: (2026)

Beyond Autoregression: Fast LLMs via Self-Distillation Through Time
by: Deschenaux, Justin, et al.
Published: (2024)

Distilling Text Style Transfer With Self-Explanation From LLMs
by: Zhang, Chiyu, et al.
Published: (2024)

X-OPD: Cross-Modal On-Policy Distillation for Capability Alignment in Speech LLMs
by: Cao, Di, et al.
Published: (2026)

Equational Bit-Vector Solving via Strong Gröbner Bases
by: Song, Jiaxin, et al.
Published: (2024)

Mapping the Schedule x Bit-Width Boundary in Sub-100M Quantisation-Aware Training
by: Thomassen, Christian Brandt
Published: (2026)

SignRoundV2: Toward Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs
by: Cheng, Wenhua, et al.
Published: (2025)

RM-Distiller: Exploiting Generative LLM for Reward Model Distillation
by: Zhou, Hongli, et al.
Published: (2026)

Bit-Vector CHC Solving for Binary Analysis and Binary Analysis for Bit-Vector CHC Solving
by: Bembenek, Aaron, et al.
Published: (2026)

Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs
by: Boizard, Nicolas, et al.
Published: (2024)

It's Morphing Time: Unleashing the Potential of Multiple LLMs via Multi-objective Optimization
by: Li, Bingdong, et al.
Published: (2024)

BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs
by: Wang, Hongyu, et al.
Published: (2025)

Bit-level BPE: Below the byte boundary
by: Moon, Sangwhan, et al.
Published: (2025)