Saved in:
Bibliographic Details
Main Authors: Bhargav, Jayanth, Ghasemi, Mahsa, Sundaram, Shreyas
Format: Preprint
Published: 2024
Subjects:
Online Access:https://arxiv.org/abs/2407.07310
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866912244986544128
author Bhargav, Jayanth
Ghasemi, Mahsa
Sundaram, Shreyas
author_facet Bhargav, Jayanth
Ghasemi, Mahsa
Sundaram, Shreyas
contents Factored Markov Decision Processes (fMDPs) are a class of Markov Decision Processes (MDPs) in which the states (and actions) can be factored into a set of state (and action) variables and can be encoded compactly using a factored representation. In this paper, we consider a setting where the state of the fMDP is not directly observable, and the agent relies on a set of potential sensors to gather information. Each sensor has a selection cost and the designer must select a subset of sensors under a limited budget. We formulate the problem of selecting a set of sensors for fMDPs (under a budget) to maximize the infinite-horizon discounted return provided by the optimal policy. We show the fundamental result that it is NP-hard to approximate this problem to within any non-trivial factor. Our inapproximability results for optimal sensor selection also extend to a general class of Partially Observable MDPs (POMDPs). We then study the dual problem of budgeted actuator selection (at design-time) to maximize the expected return under the optimal policy. Again, we show that it is NP-hard to approximate this problem to within any non-trivial factor. Furthermore, with explicit examples, we show the failure of greedy algorithms for both the sensor and actuator selection problems and provide insights into the factors that cause these problems to be challenging. Despite this, through extensive simulations, we show the practical effectiveness and near-optimal performance of the greedy algorithm for actuator and sensor selection in many real-world and randomly generated instances.
format Preprint
id arxiv_https___arxiv_org_abs_2407_07310
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Optimal Sensor and Actuator Selection for Factored Markov Decision Processes: Complexity, Approximability and Algorithms
Bhargav, Jayanth
Ghasemi, Mahsa
Sundaram, Shreyas
Systems and Control
Computational Complexity
Optimization and Control
Factored Markov Decision Processes (fMDPs) are a class of Markov Decision Processes (MDPs) in which the states (and actions) can be factored into a set of state (and action) variables and can be encoded compactly using a factored representation. In this paper, we consider a setting where the state of the fMDP is not directly observable, and the agent relies on a set of potential sensors to gather information. Each sensor has a selection cost and the designer must select a subset of sensors under a limited budget. We formulate the problem of selecting a set of sensors for fMDPs (under a budget) to maximize the infinite-horizon discounted return provided by the optimal policy. We show the fundamental result that it is NP-hard to approximate this problem to within any non-trivial factor. Our inapproximability results for optimal sensor selection also extend to a general class of Partially Observable MDPs (POMDPs). We then study the dual problem of budgeted actuator selection (at design-time) to maximize the expected return under the optimal policy. Again, we show that it is NP-hard to approximate this problem to within any non-trivial factor. Furthermore, with explicit examples, we show the failure of greedy algorithms for both the sensor and actuator selection problems and provide insights into the factors that cause these problems to be challenging. Despite this, through extensive simulations, we show the practical effectiveness and near-optimal performance of the greedy algorithm for actuator and sensor selection in many real-world and randomly generated instances.
title Optimal Sensor and Actuator Selection for Factored Markov Decision Processes: Complexity, Approximability and Algorithms
topic Systems and Control
Computational Complexity
Optimization and Control
url https://arxiv.org/abs/2407.07310