Jia, M., Zhang, Z., & Jiang, M. (2026). Prioritizing the Best: Incentivizing Reliable Multimodal Reasoning by Rewarding Beyond Answer Correctness.
Chicago Style (17th ed.) Citation: Jia, Mengzhao, Zhihan Zhang, and Meng Jiang. Prioritizing the Best: Incentivizing Reliable Multimodal Reasoning by Rewarding Beyond Answer Correctness. 2026.
MLA (9th ed.) Citation: Jia, Mengzhao, et al. Prioritizing the Best: Incentivizing Reliable Multimodal Reasoning by Rewarding Beyond Answer Correctness. 2026.