APA (7th ed.) Citation

Scheid, A., Boursier, E., Durmus, A., Jordan, M. I., Ménard, P., Moulines, E., & Valko, M. (2024). Optimal design for reward modeling in RLHF.

Chicago Style (17th ed.) Citation

Scheid, Antoine, Etienne Boursier, Alain Durmus, Michael I. Jordan, Pierre Ménard, Eric Moulines, and Michal Valko. Optimal Design for Reward Modeling in RLHF. 2024.

MLA (9th ed.) Citation

Scheid, Antoine, et al. Optimal Design for Reward Modeling in RLHF. 2024.
