Staff View: A Component-Based Survey of Interactions between Large Language Models and Multi-Armed Bandits :: Library Catalog

Bibliographic Details
Main Authors:	Chen, Siguang; Lv, Chunli; Xie, Miao
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2601.12945

_version_	1866911494876168192
author	Chen, Siguang; Lv, Chunli; Xie, Miao
author_facet	Chen, Siguang; Lv, Chunli; Xie, Miao
contents	Large language models (LLMs) have become powerful and widely used systems for language understanding and generation, while multi-armed bandit (MAB) algorithms provide a principled framework for adaptive decision-making under uncertainty. This survey explores the potential at the intersection of these two fields. To our knowledge, it is the first survey to systematically review the bidirectional interaction between large language models and multi-armed bandits at the component level. We highlight the bidirectional benefits: MAB algorithms address critical LLM challenges, spanning from pre-training to retrieval-augmented generation (RAG) and personalization. Conversely, LLMs enhance MAB systems by redefining core components such as arm definition and environment modeling, thereby improving decision-making in sequential tasks. We analyze existing LLM-enhanced bandit systems and bandit-enhanced LLM systems, providing insights into their design, methodologies, and performance. Key challenges and representative findings are identified to help guide future research. An accompanying GitHub repository that indexes relevant literature is available at https://github.com/bucky1119/Awesome-LLM-Bandit-Interaction.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_12945
institution	arXiv
publishDate	2026
record_format	arxiv
title	A Component-Based Survey of Interactions between Large Language Models and Multi-Armed Bandits
topic	Computation and Language; Machine Learning
url	https://arxiv.org/abs/2601.12945

Similar Items