Record Citations :: Library Catalog

APA (7th ed.) Citation

Wang, Z. (2026). GRPO and reflection reward for mathematical reasoning in large language models.

Chicago Style (17th ed.) Citation

Wang, Zhijie. GRPO and Reflection Reward for Mathematical Reasoning in Large Language Models. 2026.

MLA (9th ed.) Citation

Wang, Zhijie. GRPO and Reflection Reward for Mathematical Reasoning in Large Language Models. 2026.

Warning: These citations may not always be 100% accurate.