Sahoo, S. (2025). The Good, The Bad, and The Hybrid: A Reward Structure Showdown in Reasoning Models Training.
Chicago Style (17th ed.) Citation: Sahoo, Subramanyam. The Good, The Bad, and The Hybrid: A Reward Structure Showdown in Reasoning Models Training. 2025.
MLA (9th ed.) Citation: Sahoo, Subramanyam. The Good, The Bad, and The Hybrid: A Reward Structure Showdown in Reasoning Models Training. 2025.