Bibliographic Details
Main Authors: Shi, Chongyang, Han, Shuo, Dorothy, Michael, Fu, Jie
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2409.16439
author Shi, Chongyang
Han, Shuo
Dorothy, Michael
Fu, Jie
contents This paper studies the synthesis of an active perception policy that maximizes the information leakage of the initial state in a stochastic system modeled as a hidden Markov model (HMM). Specifically, the emission function of the HMM is controllable through a set of perception or sensor query actions. Because the goal is to infer the initial state from partial observations in the HMM, we use Shannon conditional entropy as the planning objective and develop a novel policy gradient method with convergence guarantees. By leveraging a variant of observable operators in HMMs, we prove several important properties of the gradient of the conditional entropy with respect to the policy parameters, which allow efficient computation of the policy gradient and stable, fast convergence. We demonstrate the effectiveness of our solution by applying it to an inference problem in a stochastic grid world environment.
format Preprint
id arxiv_https___arxiv_org_abs_2409_16439
institution arXiv
publishDate 2024
record_format arxiv
title Active Perception with Initial-State Uncertainty: A Policy Gradient Method
topic Systems and Control
url https://arxiv.org/abs/2409.16439