APA (7th ed.) Citation

Li, L.-F., Qian, Y.-Y., Zhao, P., & Zhou, Z.-H. (2025). Provably efficient online RLHF with one-pass reward modeling.

Chicago Style (17th ed.) Citation

Li, Long-Fei, Yu-Yang Qian, Peng Zhao, and Zhi-Hua Zhou. Provably Efficient Online RLHF with One-Pass Reward Modeling. 2025.

MLA (9th ed.) Citation

Li, Long-Fei, et al. Provably Efficient Online RLHF with One-Pass Reward Modeling. 2025.
