APA (7th ed.) Citation

Wang, Z., Bi, B., Zhu, Z., Mao, X., Wang, J., Wang, S., Wang, C., Nie, D., & Hong, L. (2024). UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA through a Generalized Implicit Reward Function.

Chicago Style (17th ed.) Citation

Wang, Zhichao, Bin Bi, Zixu Zhu, Xiangbo Mao, Jun Wang, Shiyu Wang, Cheng Wang, Dong Nie, and Lingzi Hong. UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA Through a Generalized Implicit Reward Function. 2024.

MLA (9th ed.) Citation

Wang, Zhichao, et al. UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA Through a Generalized Implicit Reward Function. 2024.

Warning: These citations may not always be 100% accurate.