Zhang, G., Bao, H., & Kashima, H. (2024). Online Policy Learning from Offline Preferences.
Chicago Style (17th ed.) Citation: Zhang, Guoxi, Han Bao, and Hisashi Kashima. Online Policy Learning from Offline Preferences. 2024.
MLA (9th ed.) Citation: Zhang, Guoxi, et al. Online Policy Learning from Offline Preferences. 2024.