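Staff View: Probing Implicit Bias in Semi-gradient Q-learning: Visualizing the Effective Loss Landscapes via the Fokker--Planck Equation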

Bibliographic Details
Main Authors:	Yin, Shuyu; Wen, Fei; Liu, Peilin; Luo, Tao
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2406.08148

_version_	1866911914360045568
author	Yin, Shuyu; Wen, Fei; Liu, Peilin; Luo, Tao
author_facet	Yin, Shuyu; Wen, Fei; Liu, Peilin; Luo, Tao
contents	Semi-gradient Q-learning is applied in many fields, but because it lacks an explicit loss function, studying its dynamics and implicit bias in the parameter space is challenging. This paper introduces the Fokker--Planck equation and employs partial data obtained through sampling to construct and visualize the effective loss landscape within a two-dimensional parameter space. This visualization reveals how the global minima in the loss landscape can transform into saddle points in the effective loss landscape, as well as the implicit bias of the semi-gradient method. Additionally, we demonstrate that saddle points originating from the global minima in the loss landscape still exist in the effective loss landscape in high-dimensional parameter spaces and neural network settings. This paper develops a novel approach for probing implicit bias in semi-gradient Q-learning.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_08148
institution	arXiv
publishDate	2024
record_format	arxiv
title	Probing Implicit Bias in Semi-gradient Q-learning: Visualizing the Effective Loss Landscapes via the Fokker--Planck Equation
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2406.08148

Similar Items