Dong, D., Chen, J., Jia, H., Wu, J., Di, H., Liu, J., . . . Wang, H. (2026). PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning.
Chicago Style (17th ed.) Citation: Dong, Daize, et al. PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning. 2026.
MLA (9th ed.) Citation: Dong, Daize, et al. PR2: Predictive Routing Replay for MoE-Based LLM Reinforcement Learning. 2026.