Staff View: Model selection for behavioral learning data and applications to contextual bandits :: Library Catalog

Bibliographic Details
Main Authors:	Aubert, Julien; Köhler, Louis; Lehéricy, Luc; Mezzadri, Giulia; Reynaud-Bouret, Patricia
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2502.13186

_version_	1866912237701038080
author	Aubert, Julien; Köhler, Louis; Lehéricy, Luc; Mezzadri, Giulia; Reynaud-Bouret, Patricia
author_facet	Aubert, Julien; Köhler, Louis; Lehéricy, Luc; Mezzadri, Giulia; Reynaud-Bouret, Patricia
contents	Learning for animals or humans is the process that leads to behaviors better adapted to the environment. This process highly depends on the individual that learns and is usually observed only through the individual's actions. This article presents ways to use this individual behavioral data to find the model that best explains how the individual learns. We propose two model selection methods: a general hold-out procedure and an AIC-type criterion, both adapted to non-stationary dependent data. We provide theoretical error bounds for these methods that are close to those of the standard i.i.d. case. To compare these approaches, we apply them to contextual bandit models and illustrate their use on both synthetic and experimental learning data in a human categorization task.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_13186
institution	arXiv
publishDate	2025
record_format	arxiv
title	Model selection for behavioral learning data and applications to contextual bandits
topic	Machine Learning
url	https://arxiv.org/abs/2502.13186

Similar Items