APA (7th ed.) Citation

Gao, J., Chen, J., He, C., Xu, S., Jin, D., & Wu, Y. (2026). From self-evolving synthetic data to verifiable-reward RL: Post-training multi-turn interactive tool-using agents.

Chicago Style (17th ed.) Citation

Gao, Jiaxuan, Jiaao Chen, Chuyi He, Shusheng Xu, Di Jin, and Yi Wu. From Self-Evolving Synthetic Data to Verifiable-Reward RL: Post-Training Multi-turn Interactive Tool-Using Agents. 2026.

MLA (9th ed.) Citation

Gao, Jiaxuan, et al. From Self-Evolving Synthetic Data to Verifiable-Reward RL: Post-Training Multi-turn Interactive Tool-Using Agents. 2026.
