APA (7th ed.) Citation

Chidambaram, K., Krishnamurthy, S. K., Xu, Q., Hsiao, K., & Bhattacharya, M. (2026). Robust post-training for generative recommenders: Why exponential reward-weighted SFT outperforms RLHF.

Chicago Style (17th ed.) Citation

Chidambaram, Keertana, Sanath Kumar Krishnamurthy, Qiuling Xu, Ko-Jen Hsiao, and Moumita Bhattacharya. Robust Post-Training for Generative Recommenders: Why Exponential Reward-Weighted SFT Outperforms RLHF. 2026.

MLA (9th ed.) Citation

Chidambaram, Keertana, et al. Robust Post-Training for Generative Recommenders: Why Exponential Reward-Weighted SFT Outperforms RLHF. 2026.
