Mathew, S., & Harshit, N. (2025). Counterfactual Reward Model Training for Bias Mitigation in Multimodal Reinforcement Learning.
Chicago Style (17th ed.) Citation: Mathew, Sheryl, and N. Harshit. Counterfactual Reward Model Training for Bias Mitigation in Multimodal Reinforcement Learning. 2025.
MLA (9th ed.) Citation: Mathew, Sheryl, and N. Harshit. Counterfactual Reward Model Training for Bias Mitigation in Multimodal Reinforcement Learning. 2025.