Yu, K., Baek, B., & Lee, D. (2026). Learning Weakly Communicating Average-Reward CMDPs: Strong Duality and Improved Regret.
Chicago Style (17th ed.) Citation: Yu, Kihyun, Beomhan Baek, and Dabeen Lee. Learning Weakly Communicating Average-Reward CMDPs: Strong Duality and Improved Regret. 2026.
MLA (9th ed.) Citation: Yu, Kihyun, et al. Learning Weakly Communicating Average-Reward CMDPs: Strong Duality and Improved Regret. 2026.