Yang, Q., Ni, B., Xiang, S., Hu, H., Peng, H., & Jiang, J. (2025). R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning.
Chicago Style (17th ed.) Citation: Yang, Qi, Bolin Ni, Shiming Xiang, Han Hu, Houwen Peng, and Jie Jiang. R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning. 2025.
MLA (9th ed.) Citation: Yang, Qi, et al. R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning. 2025.