Staff View: Active Perception with Initial-State Uncertainty: A Policy Gradient Method :: Library Catalog

Bibliographic Details
Main Authors:	Shi, Chongyang; Han, Shuo; Dorothy, Michael; Fu, Jie
Format:	Preprint
Published:	2024
Subjects:	Systems and Control
Online Access:	https://arxiv.org/abs/2409.16439

_version_	1866914956787580928
author	Shi, Chongyang; Han, Shuo; Dorothy, Michael; Fu, Jie
author_facet	Shi, Chongyang; Han, Shuo; Dorothy, Michael; Fu, Jie
contents	This paper studies the synthesis of an active perception policy that maximizes the information leakage of the initial state in a stochastic system modeled as a hidden Markov model (HMM). Specifically, the emission function of the HMM is controllable with a set of perception or sensor query actions. Given the goal is to infer the initial state from partial observations in the HMM, we use Shannon conditional entropy as the planning objective and develop a novel policy gradient method with convergence guarantees. By leveraging a variant of observable operators in HMMs, we prove several important properties of the gradient of the conditional entropy with respect to the policy parameters, which allow efficient computation of the policy gradient and stable and fast convergence. We demonstrate the effectiveness of our solution by applying it to an inference problem in a stochastic grid world environment.
format	Preprint
id	arxiv_https___arxiv_org_abs_2409_16439
institution	arXiv
publishDate	2024
record_format	arxiv
title	Active Perception with Initial-State Uncertainty: A Policy Gradient Method
topic	Systems and Control
url	https://arxiv.org/abs/2409.16439

Similar Items