Yang, Z., Wang, Y., Li, R., & Sui, Z. (2026). Towards Better RL Training Data Utilization via Second-Order Rollout.
Chicago Style (17th ed.) Citation: Yang, Zhe, Yudong Wang, Rang Li, and Zhifang Sui. Towards Better RL Training Data Utilization via Second-Order Rollout. 2026.
MLA (9th ed.) Citation: Yang, Zhe, et al. Towards Better RL Training Data Utilization via Second-Order Rollout. 2026.