Bibliographic Details
Main Authors: Lefebvre, Randy; Durand, Audrey
Format: Preprint
Published: 2024
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2407.15820
author Lefebvre, Randy
Durand, Audrey
contents Formulating a real-world problem under the Reinforcement Learning framework involves non-trivial design choices, such as selecting a discount factor for the learning objective (discounted cumulative rewards), which articulates the planning horizon of the agent. This work investigates the impact of the discount factor on the bias-variance trade-off given structural parameters of the underlying Markov Decision Process. Our results support the idea that a shorter planning horizon might be beneficial, especially under partial observability.
format Preprint
id arxiv_https___arxiv_org_abs_2407_15820
institution arXiv
publishDate 2024
record_format arxiv
title On shallow planning under partial observability
topic Artificial Intelligence
url https://arxiv.org/abs/2407.15820