Staff View: Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits

Bibliographic Details
Main Authors:	Gu, Yuzhou; Han, Yanjun; Qian, Jian
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Information Theory
Online Access:	https://arxiv.org/abs/2503.00273

_version_	1866918165259223040
author	Gu, Yuzhou; Han, Yanjun; Qian, Jian
author_facet	Gu, Yuzhou; Han, Yanjun; Qian, Jian
contents	We study the evolution of information in interactive decision making through the lens of a stochastic multi-armed bandit problem. Focusing on a fundamental example where a unique optimal arm outperforms the rest by a fixed margin, we characterize the optimal success probability and mutual information over time. Our findings reveal distinct growth phases in mutual information -- initially linear, transitioning to quadratic, and finally returning to linear -- highlighting curious behavioral differences between interactive and non-interactive environments. In particular, we show that optimal success probability and mutual information can be decoupled, where achieving optimal learning does not necessarily require maximizing information gain. These findings shed new light on the intricate interplay between information and learning in interactive decision making.
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_00273
institution	arXiv
publishDate	2025
record_format	arxiv
title	Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits
topic	Machine Learning; Information Theory
url	https://arxiv.org/abs/2503.00273
