Pace, A., Mallinson, J., Malmi, E., Krause, S., & Severyn, A. (2024). West-of-N: Synthetic Preferences for Self-Improving Reward Models.
Chicago Style (17th ed.) Citation: Pace, Alizée, Jonathan Mallinson, Eric Malmi, Sebastian Krause, and Aliaksei Severyn. West-of-N: Synthetic Preferences for Self-Improving Reward Models. 2024.
MLA (9th ed.) Citation: Pace, Alizée, et al. West-of-N: Synthetic Preferences for Self-Improving Reward Models. 2024.