Staff View: Learning to Persuade a Biased Receiver :: Library Catalog

Bibliographic Details
Main Authors:	Pan, Yuqi; Zhao, Sadie; Tambe, Milind; Chen, Yiling
Format:	Preprint
Published:	2026
Subjects:	Computer Science and Game Theory
Online Access:	https://arxiv.org/abs/2605.15331

_version_	1866913130642145280
author	Pan, Yuqi; Zhao, Sadie; Tambe, Milind; Chen, Yiling
author_facet	Pan, Yuqi; Zhao, Sadie; Tambe, Milind; Chen, Yiling
contents	We study a repeated information design setting in which the receiver, who is also the decision-maker, updates beliefs in a systematically biased way. More specifically, a distorted posterior in our model can be written as a convex combination of the prior and the Bayesian posterior, governed by a fixed but unknown parameter. Over repeated interactions, the sender chooses persuasive signaling schemes, observes only the receiver's realized actions, and seeks to minimize regret relative to a full-information oracle that knows the receiver's biased updating rule. We propose a safe exploration algorithm for learning the receiver's bias while maintaining high persuasion value. The algorithm exploits the asymmetric cost of probing: conservative probes incur only local loss, whereas overly aggressive probes may lose the persuasive opportunity entirely. For general finite state and action spaces and arbitrary bounded utilities, our method achieves $O(\log\log T)$ regret. A matching $\Omega(\log\log T)$ lower bound shows that this rate is optimal. We further discuss the influence on receiver welfare, as well as extensions to jointly unknown prior and bias, and contextual settings with time-varying priors and utilities.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_15331
institution	arXiv
publishDate	2026
record_format	arxiv
title	Learning to Persuade a Biased Receiver
topic	Computer Science and Game Theory
url	https://arxiv.org/abs/2605.15331
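
As a rough illustration of the belief-updating model described in the contents field, the following Python sketch computes a distorted posterior as a convex combination of the prior and the Bayesian posterior. The function names `bayes_posterior` and `distorted_posterior`, the parameter name `lam`, the use of NumPy, and the convention that `lam` is the weight placed on the Bayesian posterior are all assumptions made for illustration; the record does not state the paper's notation or parameterization, and the numbers are made up.

```python
import numpy as np


def bayes_posterior(prior, likelihood, signal):
    """Standard Bayesian posterior over states after observing `signal`.

    prior:      shape (S,), a probability vector over states.
    likelihood: shape (S, M), where likelihood[s, m] = P(signal m | state s).
    signal:     index of the realized signal.
    """
    prior = np.asarray(prior, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    joint = prior * likelihood[:, signal]
    total = joint.sum()
    if total <= 0:
        raise ValueError("signal has zero probability under the prior")
    return joint / total


def distorted_posterior(prior, likelihood, signal, lam):
    """Convex combination of the prior and the Bayesian posterior.

    Assumed convention (hypothetical, not taken from the paper):
    lam = 1 recovers Bayesian updating, lam = 0 ignores the signal.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1] for a convex combination")
    prior = np.asarray(prior, dtype=float)
    post = bayes_posterior(prior, likelihood, signal)
    return (1.0 - lam) * prior + lam * post


# Illustrative two-state, two-signal example with made-up numbers.
prior = np.array([0.7, 0.3])
scheme = np.array([[0.8, 0.2],
                   [0.1, 0.9]])
for lam in (0.0, 0.5, 1.0):
    print(lam, distorted_posterior(prior, scheme, signal=1, lam=lam))
```

Under this assumed convention, a receiver with a smaller `lam` moves less far from the prior for the same signal, so a sender who overestimates `lam` may fail to shift the receiver's belief far enough to change the action.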

Similar Items