Zhang, S., Guo, X., Guo, R., Liu, S., Wang, X., Jiang, G., & Zhang, K. (2026). Answer First, Reason Later: Aligning Search Relevance via Mode-Balanced Reinforcement Learning.
Chicago Style (17th ed.) Citation: Zhang, Shijie, Xiang Guo, Rujun Guo, Shaoyu Liu, Xiaozhao Wang, Guanjun Jiang, and Kevin Zhang. Answer First, Reason Later: Aligning Search Relevance via Mode-Balanced Reinforcement Learning. 2026.
MLA (9th ed.) Citation: Zhang, Shijie, et al. Answer First, Reason Later: Aligning Search Relevance via Mode-Balanced Reinforcement Learning. 2026.