APA (7th ed.) Citation

Shu, Y., Wei, C., Lin, H., Qiu, S., & Xiong, H. (2026). Reference-sampled Boltzmann projection for KL-regularized RLVR: Target-matched weighted SFT, finite one-shot gaps, and policy mirror descent.

Chicago Style (17th ed.) Citation

Shu, Yao, Chenxing Wei, Hongbin Lin, Shuang Qiu, and Hui Xiong. Reference-Sampled Boltzmann Projection for KL-Regularized RLVR: Target-Matched Weighted SFT, Finite One-Shot Gaps, and Policy Mirror Descent. 2026.

MLA (9th ed.) Citation

Shu, Yao, et al. Reference-Sampled Boltzmann Projection for KL-Regularized RLVR: Target-Matched Weighted SFT, Finite One-Shot Gaps, and Policy Mirror Descent. 2026.
