Bibliographic Details
Main Authors: Jiang, Chenyang, Kim, Donggyu, Quintos, Alejandra, Wang, Yazhen
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2411.11697
author Jiang, Chenyang
Kim, Donggyu
Quintos, Alejandra
Wang, Yazhen
contents Reinforcement Learning (RL) has proven effective in solving complex decision-making tasks across various domains, but challenges remain in continuous-time settings, particularly when state dynamics are governed by stochastic differential equations (SDEs) with jump components. In this paper, we address this challenge by introducing the Mean-Square Bipower Variation Error (MSBVE) algorithm, which enhances robustness and convergence in scenarios involving significant stochastic noise and jumps. We first revisit the Mean-Square TD Error (MSTDE) algorithm, commonly used in continuous-time RL, and highlight its limitations in handling jumps in state dynamics. The proposed MSBVE algorithm minimizes the mean-square quadratic variation error, offering improved performance over MSTDE in environments characterized by SDEs with jumps. Simulations and formal proofs demonstrate that the MSBVE algorithm reliably estimates the value function in complex settings, surpassing MSTDE's performance when faced with jump processes. These findings underscore the importance of alternative error metrics to improve the resilience and effectiveness of RL algorithms in continuous-time frameworks.
format Preprint
id arxiv_https___arxiv_org_abs_2411_11697
institution arXiv
publishDate 2024
record_format arxiv
title Robust Reinforcement Learning under Diffusion Models for Data with Jumps
topic Machine Learning
url https://arxiv.org/abs/2411.11697