APA (7th ed.) Citation

Zhang, X., Ton, J.-F., Shen, W., Wang, H., & Liu, Y. (2024). Overcoming reward overoptimization via adversarial policy optimization with lightweight uncertainty estimation.

Chicago Style (17th ed.) Citation

Zhang, Xiaoying, Jean-Francois Ton, Wei Shen, Hongning Wang, and Yang Liu. Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation. 2024.

MLA (9th ed.) Citation

Zhang, Xiaoying, et al. Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation. 2024.
