Bibliographic Details
Main Authors: Cheikhi, David; Russo, Daniel
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2403.07136
_version_ 1866913261925957632
author Cheikhi, David
Russo, Daniel
author_facet Cheikhi, David
Russo, Daniel
contents Identifying the trade-offs between model-based and model-free methods is a central question in reinforcement learning. Value-based methods offer substantial computational advantages and are sometimes just as statistically efficient as model-based methods. However, focusing on the core problem of policy evaluation, we show that information about the transition dynamics may be impossible to represent in the space of value functions. We explore this through a series of case studies focused on structures that arise in many important problems. In several, there is no information loss and value-based methods are as statistically efficient as model-based ones. In other closely related examples, information loss is severe and value-based methods are substantially outperformed. A deeper investigation points to the limited representational power of value functions as the driver of the inefficiency, as opposed to a failure in algorithm design.
format Preprint
id arxiv_https___arxiv_org_abs_2403_07136
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle On the Limited Representational Power of Value Functions and its Links to Statistical (In)Efficiency
Cheikhi, David
Russo, Daniel
Machine Learning
Artificial Intelligence
title On the Limited Representational Power of Value Functions and its Links to Statistical (In)Efficiency
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2403.07136