Bibliographic Details
Main Authors: Zhang, Haoran; Zhang, Wenhao; Wu, Xianping
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2509.23062
Table of Contents:
  • This paper addresses the problem of dynamic asset allocation under uncertainty, which can be formulated as a linear-quadratic (LQ) control problem with multiplicative noise. To handle the exploration-exploitation trade-off and induce sparse control actions, we introduce Tsallis entropy as a regularization term. We develop an entropy-regularized policy iteration scheme and provide theoretical guarantees for its convergence. For cases where the system dynamics are unknown, we further propose a fully data-driven algorithm that estimates Q-functions using an instrumental-variable least-squares approach, allowing efficient and stable policy updates. Our framework connects entropy-regularized stochastic control with model-free reinforcement learning, offering new tools for intelligent decision-making in finance and automation.
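
The instrumental-variable least-squares idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it omits the Tsallis entropy term and only evaluates a fixed linear policy u = -k x on a scalar system with multiplicative noise. The function names (`features`, `iv_q_estimate`, `simulate`, `true_q_weights`), the quadratic feature map, the discount factor `gamma`, and all numeric parameters are assumptions chosen for illustration.

```python
import numpy as np

def features(x, u):
    """Quadratic features of a scalar state-action pair: [x^2, x*u, u^2]."""
    return np.stack([x * x, x * u, u * u], axis=-1)

def iv_q_estimate(x, u, cost, x_next, k, gamma):
    """Instrumental-variable least-squares estimate of Q-function weights.

    Solves  sum_t phi_t (phi_t - gamma * phi'_t)^T theta = sum_t phi_t c_t,
    using the current features phi_t as the instrument (an illustrative choice).
    """
    phi = features(x, u)
    phi_next = features(x_next, -k * x_next)  # next action from the policy u = -k x
    A = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ cost
    return np.linalg.solve(A, b)

def simulate(n, a, b, sa, sb, q, r, rng):
    """Sample n transitions of x' = (a + sa*e1) x + (b + sb*e2) u with exploratory actions."""
    x = rng.normal(size=n)
    u = rng.normal(size=n)  # exploratory actions, independent of the state
    a_t = a + sa * rng.normal(size=n)
    b_t = b + sb * rng.normal(size=n)
    x_next = a_t * x + b_t * u
    cost = q * x**2 + r * u**2
    return x, u, cost, x_next

def true_q_weights(a, b, sa, sb, q, r, k, gamma):
    """Closed-form Q weights for the policy u = -k x, used here only as a reference."""
    m = (a - b * k) ** 2 + sa**2 + (sb * k) ** 2
    P = (q + r * k**2) / (1.0 - gamma * m)  # value function V(x) = P x^2
    return np.array([
        q + gamma * P * (a**2 + sa**2),
        2.0 * gamma * P * a * b,
        r + gamma * P * (b**2 + sb**2),
    ])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = dict(a=0.9, b=0.5, sa=0.1, sb=0.1, q=1.0, r=0.5)
    data = simulate(50_000, rng=rng, **params)
    print("estimated:", iv_q_estimate(*data, k=0.4, gamma=0.95))
    print("reference:", true_q_weights(k=0.4, gamma=0.95, **params))
```

With the noise scales `sa` and `sb` set to zero, the Bellman equation holds exactly for every sample, so the estimate matches the closed-form weights up to floating-point error. With noise, the instrument is uncorrelated with the zero-mean transition noise, which is why the estimate approaches the reference as the sample size grows.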