Ahmed, A. M., Rafailov, R., Sharkov, S., Li, X., & Koyejo, S. (2024). Scalable Ensembling For Mitigating Reward Overoptimisation.
Chicago Style (17th ed.) Citation: Ahmed, Ahmed M., Rafael Rafailov, Stepan Sharkov, Xuechen Li, and Sanmi Koyejo. Scalable Ensembling For Mitigating Reward Overoptimisation. 2024.
MLA (9th ed.) Citation: Ahmed, Ahmed M., et al. Scalable Ensembling For Mitigating Reward Overoptimisation. 2024.