APA (7th ed.) Citation

Raheja, T., & Pochhi, N. (2026). From RLHF to direct alignment: A theoretical unification of preference learning for large language models.

Chicago Style (17th ed.) Citation

Raheja, Tarun, and Nilay Pochhi. From RLHF to Direct Alignment: A Theoretical Unification of Preference Learning for Large Language Models. 2026.

MLA (9th ed.) Citation

Raheja, Tarun, and Nilay Pochhi. From RLHF to Direct Alignment: A Theoretical Unification of Preference Learning for Large Language Models. 2026.

Warning: These citations may not be fully accurate.