APA Citation: Shi, D., Glatt, R., Klymko, C., Mohole, S., Choi, H., Kushwaha, S., . . . da Silva, F. L. (2025). Oracle-RLAIF: An Improved Fine-Tuning Framework for Multi-modal Video Models through Reinforcement Learning from Ranking Feedback.
Chicago Style (17th ed.) Citation: Shi, Derek, Ruben Glatt, Christine Klymko, Shubham Mohole, Hongjun Choi, Shashank Kushwaha, Sam Sakla, and Felipe Leno da Silva. Oracle-RLAIF: An Improved Fine-Tuning Framework for Multi-modal Video Models Through Reinforcement Learning from Ranking Feedback. 2025.
MLA (9th ed.) Citation: Shi, Derek, et al. Oracle-RLAIF: An Improved Fine-Tuning Framework for Multi-modal Video Models Through Reinforcement Learning from Ranking Feedback. 2025.
Warning: These citations may not always be 100% accurate.