MARC21: Library Catalog

Bibliographic Details
Main Authors:	Xu, Luyue; Wang, Liming; Xie, Hong; Zhou, Mingqiang
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence; Information Retrieval
Online Access:	https://arxiv.org/abs/2408.14432

_version_	1866913483026595840
author	Xu, Luyue; Wang, Liming; Xie, Hong; Zhou, Mingqiang
author_facet	Xu, Luyue; Wang, Liming; Xie, Hong; Zhou, Mingqiang
contents	Contextual bandits serve as a fundamental algorithmic framework for optimizing recommendation decisions online. Though extensive attention has been paid to tailoring contextual bandits for recommendation applications, the "herding effects" in user feedback have been ignored. These herding effects bias user feedback toward historical ratings, breaking down the assumption of unbiased feedback inherent in contextual bandits. This paper develops a novel variant of the contextual bandit that is tailored to address the feedback bias caused by the herding effects. A user feedback model is formulated to capture this feedback bias. We design the TS-Conf (Thompson Sampling under Conformity) algorithm, which employs posterior sampling to balance the exploration and exploitation tradeoff. We prove an upper bound for the regret of the algorithm, revealing the impact of herding effects on learning speed. Extensive experiments on datasets demonstrate that TS-Conf outperforms four benchmark algorithms. Analysis reveals that TS-Conf effectively mitigates the negative impact of herding effects, resulting in faster learning and improved recommendation accuracy.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_14432
institution	arXiv
publishDate	2024
record_format	arxiv
title	Contextual Bandit with Herding Effects: Algorithms and Recommendation Applications
topic	Machine Learning; Artificial Intelligence; Information Retrieval
url	https://arxiv.org/abs/2408.14432
