APA (7th ed.) Citation

Khan, R. M. S., Liu, Z., Tan, Z., Fleming, C., & Chen, T. (2026). TMS: Trajectory-mixed supervision for reward-free, on-policy SFT.

Chicago Style (17th ed.) Citation

Khan, Rana Muhammad Shahroz, Zijie Liu, Zhen Tan, Charles Fleming, and Tianlong Chen. TMS: Trajectory-Mixed Supervision for Reward-Free, On-Policy SFT. 2026.

MLA (9th ed.) Citation

Khan, Rana Muhammad Shahroz, et al. TMS: Trajectory-Mixed Supervision for Reward-Free, On-Policy SFT. 2026.
