Wang, Z., Lou, X., Wu, M., Wen, Z., & Zhang, J. (2026). Calibration-Aware Policy Optimization for Reasoning LLMs.
Chicago Style (17th ed.) Citation: Wang, Ziqi, Xingzhou Lou, Meiqi Wu, Zhengqi Wen, and Junge Zhang. Calibration-Aware Policy Optimization for Reasoning LLMs. 2026.
MLA (9th ed.) Citation: Wang, Ziqi, et al. Calibration-Aware Policy Optimization for Reasoning LLMs. 2026.