Sahu, S., & Wells, M. T. (2025). Online Distributionally Robust LLM Alignment via Regression to Relative Reward.
Chicago Style (17th ed.) Citation: Sahu, Sharan, and Martin T. Wells. Online Distributionally Robust LLM Alignment via Regression to Relative Reward. 2025.
MLA (9th ed.) Citation: Sahu, Sharan, and Martin T. Wells. Online Distributionally Robust LLM Alignment via Regression to Relative Reward. 2025.