Raheja, T., & Pochhi, N. (2026). From RLHF to Direct Alignment: A Theoretical Unification of Preference Learning for Large Language Models.
Chicago Style (17th ed.) Citation: Raheja, Tarun, and Nilay Pochhi. From RLHF to Direct Alignment: A Theoretical Unification of Preference Learning for Large Language Models. 2026.
MLA (9th ed.) Citation: Raheja, Tarun, and Nilay Pochhi. From RLHF to Direct Alignment: A Theoretical Unification of Preference Learning for Large Language Models. 2026.