Xie, S., Chen, H., Yu, F., Sun, Z., Wu, X., & Hu, Y. (2024). Minor DPO reject penalty to increase training robustness.
Chicago Style (17th ed.) Citation: Xie, Shiming, Hong Chen, Fred Yu, Zeye Sun, Xiuyu Wu, and Yingfan Hu. Minor DPO Reject Penalty to Increase Training Robustness. 2024.
MLA (9th ed.) Citation: Xie, Shiming, et al. Minor DPO Reject Penalty to Increase Training Robustness. 2024.