| Main Author: | Linder, Max Rehman |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.15807 |
Similar Items
Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models
by: Wu, Taiqiang, et al.
Published: (2024)
Better Estimation of the Kullback-Leibler Divergence Between Language Models
by: Amini, Afra, et al.
Published: (2025)
Efficient and Effective Vocabulary Expansion Towards Multilingual Large Language Models
by: Kim, Seungduk, et al.
Published: (2024)
Diversity-Aware Reverse Kullback-Leibler Divergence for Large Language Model Distillation
by: Luong, Hoang-Chau, et al.
Published: (2026)
Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation
by: Lv, Jiaming, et al.
Published: (2024)
Generalized Kullback-Leibler Divergence Loss
by: Cui, Jiequan, et al.
Published: (2025)
UNDIAL: Self-Distillation with Adjusted Logits for Robust Unlearning in Large Language Models
by: Dong, Yijiang River, et al.
Published: (2024)
Evaluating the Efficacy of Large Language Models in Identifying Phishing Attempts
by: Patel, Het, et al.
Published: (2024)
Mem$^2$Evolve: Towards Self-Evolving Agents via Co-Evolutionary Capability Expansion and Experience Distillation
by: Cheng, Zihao, et al.
Published: (2026)
A Content-Based Framework for Cybersecurity Refusal Decisions in Large Language Models
by: Linder, Noa, et al.
Published: (2026)
Overcoming Vocabulary Mismatch: Vocabulary-agnostic Teacher Guided Language Modeling
by: Shin, Haebin, et al.
Published: (2025)
SDMPrune: Self-Distillation MLP Pruning for Efficient Large Language Models
by: Zhu, Hourun, et al.
Published: (2025)
Entropy and the Kullback-Leibler Divergence for Bayesian Networks: Computational Complexity and Efficient Implementation
by: Scutari, Marco
Published: (2023)
Rule by Rule: Learning with Confidence through Vocabulary Expansion
by: Nössig, Albert, et al.
Published: (2024)
OPSDL: On-Policy Self-Distillation for Long-Context Language Models
by: Zhang, Xinsen, et al.
Published: (2026)
CSV-Decode: Certifiable Sub-Vocabulary Decoding for Efficient Large Language Model Inference
by: Liu, Dong, et al.
Published: (2025)
Black-Box On-Policy Distillation of Large Language Models
by: Ye, Tianzhu, et al.
Published: (2025)
Dual-Space Knowledge Distillation for Large Language Models
by: Zhang, Songming, et al.
Published: (2024)
MiniLLM: On-Policy Distillation of Large Language Models
by: Gu, Yuxian, et al.
Published: (2023)
EasyDistill: A Comprehensive Toolkit for Effective Knowledge Distillation of Large Language Models
by: Wang, Chengyu, et al.
Published: (2025)
FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling
by: Zhao, Weilin, et al.
Published: (2025)
PLPP: Prompt Learning with Perplexity Is Self-Distillation for Vision-Language Models
by: Liu, Biao, et al.
Published: (2024)
UniSD: Towards a Unified Self-Distillation Framework for Large Language Models
by: Jin, Yiqiao, et al.
Published: (2026)
Cross-Modal Knowledge Distillation for Speech Large Language Models
by: Wang, Enzhi, et al.
Published: (2025)
ELAD: Explanation-Guided Large Language Models Active Distillation
by: Zhang, Yifei, et al.
Published: (2024)
Distilling Event Sequence Knowledge From Large Language Models
by: Wadhwa, Somin, et al.
Published: (2024)
Is Temperature the Creativity Parameter of Large Language Models?
by: Peeperkorn, Max, et al.
Published: (2024)
Self-Preference Bias in Rubric-Based Evaluation of Large Language Models
by: Pombal, José, et al.
Published: (2026)
From Correction to Mastery: Reinforced Distillation of Large Language Model Agents
by: Lyu, Yuanjie, et al.
Published: (2025)
FedCoT: Federated Chain-of-Thought Distillation for Large Language Models
by: Fan, Tao, et al.
Published: (2024)
Survey on Knowledge Distillation for Large Language Models: Methods, Evaluation, and Application
by: Yang, Chuanpeng, et al.
Published: (2024)
Gecko: Versatile Text Embeddings Distilled from Large Language Models
by: Lee, Jinhyuk, et al.
Published: (2024)
Key-Point-Driven Mathematical Reasoning Distillation of Large Language Model
by: Zhu, Xunyu, et al.
Published: (2024)
QCRD: Quality-guided Contrastive Rationale Distillation for Large Language Models
by: Wang, Wei, et al.
Published: (2024)
ReAD: Reinforcement-Guided Capability Distillation for Large Language Models
by: Cheng, Xueqi, et al.
Published: (2026)
LLMR: Knowledge Distillation with a Large Language Model-Induced Reward
by: Li, Dongheng, et al.
Published: (2024)
Contextualization Distillation from Large Language Model for Knowledge Graph Completion
by: Li, Dawei, et al.
Published: (2024)
Delta Knowledge Distillation for Large Language Models
by: Cao, Yihan, et al.
Published: (2025)
Structured Agent Distillation for Large Language Model
by: Liu, Jun, et al.
Published: (2025)
Large Language Models Explore by Latent Distilling
by: Zeng, Yuanhao, et al.
Published: (2026)