APA (7th ed.) Citation

Chen, Z., Liu, F., Zhu, X., Qi, Y., & Ghavamzadeh, M. (2025). Preference optimization via contrastive divergence: Your reward model is secretly an NLL estimator.

Chicago Style (17th ed.) Citation

Chen, Zhuotong, Fang Liu, Xuan Zhu, Yanjun Qi, and Mohammad Ghavamzadeh. Preference Optimization via Contrastive Divergence: Your Reward Model Is Secretly an NLL Estimator. 2025.

MLA (9th ed.) Citation

Chen, Zhuotong, et al. Preference Optimization via Contrastive Divergence: Your Reward Model Is Secretly an NLL Estimator. 2025.
