He, Y., Wang, H., Jiang, Z., Papangelis, A., & Zhao, H. (2024). Semi-Supervised Reward Modeling via Iterative Self-Training.
Chicago Style (17th ed.) Citation: He, Yifei, Haoxiang Wang, Ziyan Jiang, Alexandros Papangelis, and Han Zhao. Semi-Supervised Reward Modeling via Iterative Self-Training. 2024.
MLA (9th ed.) Citation: He, Yifei, et al. Semi-Supervised Reward Modeling via Iterative Self-Training. 2024.