Chaudhary, G., Behera, L., & Mondal, W. U. (2026). Match or Replay: Self Imitating Proximal Policy Optimization.
Chicago Style (17th ed.) Citation: Chaudhary, Gaurav, Laxmidhar Behera, and Washim Uddin Mondal. Match or Replay: Self Imitating Proximal Policy Optimization. 2026.
MLA (9th ed.) Citation: Chaudhary, Gaurav, et al. Match or Replay: Self Imitating Proximal Policy Optimization. 2026.