Tan, Z., & Hong, Y. (2026). Self-Supervised On-Policy Distillation for Reasoning Language Models.
Chicago Style (17th ed.) Citation: Tan, Zhiquan, and Yinrong Hong. Self-Supervised On-Policy Distillation for Reasoning Language Models. 2026.
MLA (9th ed.) Citation: Tan, Zhiquan, and Yinrong Hong. Self-Supervised On-Policy Distillation for Reasoning Language Models. 2026.