Shen, W., Zhang, X., Yao, Y., Zheng, R., Guo, H., & Liu, Y. (2024). Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards.
Chicago Style (17th ed.) Citation: Shen, Wei, Xiaoying Zhang, Yuanshun Yao, Rui Zheng, Hongyi Guo, and Yang Liu. Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards. 2024.
MLA (9th ed.) Citation: Shen, Wei, et al. Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards. 2024.