Staff View: QGFN: Controllable Greediness with Action Values :: Library Catalog

Bibliographic Details
Main Authors:	Lau, Elaine; Lu, Stephen Zhewen; Pan, Ling; Precup, Doina; Bengio, Emmanuel
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2402.05234

_version_	1866916463840854016
author	Lau, Elaine Lu, Stephen Zhewen Pan, Ling Precup, Doina Bengio, Emmanuel
author_facet	Lau, Elaine Lu, Stephen Zhewen Pan, Ling Precup, Doina Bengio, Emmanuel
contents	Generative Flow Networks (GFlowNets; GFNs) are a family of energy-based generative methods for combinatorial objects, capable of generating diverse and high-utility samples. However, consistently biasing GFNs towards producing high-utility samples is non-trivial. In this work, we leverage connections between GFNs and reinforcement learning (RL) and propose to combine the GFN policy with an action-value estimate, $Q$, to create greedier sampling policies which can be controlled by a mixing parameter. We show that several variants of the proposed method, QGFN, are able to improve on the number of high-reward samples generated in a variety of tasks without sacrificing diversity.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_05234
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	QGFN: Controllable Greediness with Action Values Lau, Elaine Lu, Stephen Zhewen Pan, Ling Precup, Doina Bengio, Emmanuel Machine Learning Generative Flow Networks (GFlowNets; GFNs) are a family of energy-based generative methods for combinatorial objects, capable of generating diverse and high-utility samples. However, consistently biasing GFNs towards producing high-utility samples is non-trivial. In this work, we leverage connections between GFNs and reinforcement learning (RL) and propose to combine the GFN policy with an action-value estimate, $Q$, to create greedier sampling policies which can be controlled by a mixing parameter. We show that several variants of the proposed method, QGFN, are able to improve on the number of high-reward samples generated in a variety of tasks without sacrificing diversity.
title	QGFN: Controllable Greediness with Action Values
topic	Machine Learning
url	https://arxiv.org/abs/2402.05234
