Bibliographic Details
Main Authors:	Rang, Miao; Bi, Zhenni; Zhou, Hang; Han, Kai; Wang, Xuechun; Xiao, An; Chen, Xinghao; Wang, Yunhe; Chen, Hanting
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2605.05940

Similar Items

Revealing the Power of Post-Training for Small Language Models via Knowledge Distillation
by: Rang, Miao, et al.
Published: (2025)

An Empirical Study of Scaling Law for OCR
by: Rang, Miao, et al.
Published: (2023)

Eve: Efficient Multimodal Vision Language Models with Elastic Visual Experts
by: Rang, Miao, et al.
Published: (2025)

Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning
by: Bi, Zhenni, et al.
Published: (2024)

Nexus: Higher-Order Attention Mechanisms in Transformers
by: Chen, Hanting, et al.
Published: (2025)

VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse
by: Nie, Ying, et al.
Published: (2025)

ROOT: Robust Orthogonalized Optimizer for Neural Network Training
by: He, Wei, et al.
Published: (2025)

Unshackling Context Length: An Efficient Selective Attention Approach through Query-Key Compression
by: Wang, Haoyu, et al.
Published: (2025)

DiJiang: Efficient Large Language Models through Compact Kernelization
by: Chen, Hanting, et al.
Published: (2024)

Deferred Commitment Decoding for Diffusion Language Models
by: Shu, Yingte, et al.
Published: (2026)

SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization
by: Guo, Jialong, et al.
Published: (2024)

Transferable text data distillation by trajectory matching
by: Yao, Rong, et al.
Published: (2025)

GhostRNN: Reducing State Redundancy in RNN with Cheap Operations
by: Zhou, Hang, et al.
Published: (2024)

From Next-Token to Next-Block: A Principled Adaptation Path for Diffusion LLMs
by: Tian, Yuchuan, et al.
Published: (2025)

Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition
by: Chen, Hanting, et al.
Published: (2025)

Multi-Granularity Semantic Revision for Large Language Model Distillation
by: Liu, Xiaoyu, et al.
Published: (2024)

Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs
by: Chen, Hanting, et al.
Published: (2025)

The Extrapolation Cliff in On-Policy Distillation of Near-Deterministic Structured Outputs
by: Li, Xin, et al.
Published: (2026)

Multiscale Positive-Unlabeled Detection of AI-Generated Texts
by: Tian, Yuchuan, et al.
Published: (2023)

Student-in-the-Loop Chain-of-Thought Distillation via Generation-Time Selection
by: He, Chaoqun, et al.
Published: (2026)

Surgical Post-Training: Proximal On-Policy Distillation for Reasoning with Knowledge Retention
by: Lin, Wenye, et al.
Published: (2026)

Bridging Reasoning Trajectories in On-Policy Distillation via Near-Future Guidance
by: Jiang, Yuxuan, et al.
Published: (2026)

Self-Policy Distillation via Capability-Selective Subspace Projection
by: Hao, Guangya, et al.
Published: (2026)

EMS-SD: Efficient Multi-sample Speculative Decoding for Accelerating Large Language Models
by: Ni, Yunsheng, et al.
Published: (2024)

An Empirical Study of World Model Quantization
by: Fu, Zhongqian, et al.
Published: (2026)

C-MOP: Integrating Momentum and Boundary-Aware Clustering for Enhanced Prompt Evolution
by: Yan, Binwei, et al.
Published: (2026)

MAIGO: Mitigating Lost-in-Conversation with History-Cleaned On-Policy Self-Distillation
by: Zheng, Haoyu, et al.
Published: (2026)

Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity
by: Tang, Yehui, et al.
Published: (2025)

Scaling Reasoning Efficiently via Relaxed On-Policy Distillation
by: Ko, Jongwoo, et al.
Published: (2026)

MAD-OPD: Breaking the Ceiling in On-Policy Distillation via Multi-Agent Debate
by: Wang, Jianze, et al.
Published: (2026)

Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
by: Guo, Jianyuan, et al.
Published: (2024)

Hybrid Policy Distillation for LLMs
by: Zhu, Wenhong, et al.
Published: (2026)

Trust Region On-Policy Distillation
by: Xing, Xingrun, et al.
Published: (2026)

Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL
by: Wang, Sudong, et al.
Published: (2026)

Rethinking 1-bit Optimization Leveraging Pre-trained Large Language Models
by: Tu, Zhijun, et al.
Published: (2025)

OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification
by: Zhou, Yuhang, et al.
Published: (2026)

Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation
by: Yang, Wenkai, et al.
Published: (2026)

Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners
by: Xu, Xin, et al.
Published: (2025)

Internalize the Temperature: On-Policy Self-Distillation as Policy Reheater for Reinforcement Learning
by: Yang, Xuewei, et al.
Published: (2026)

Are Full Rollouts Necessary for On-Policy Distillation?
by: Zhang, Yaocheng, et al.
Published: (2026)