Bibliographic Details
Main Authors: Aubert, Julien, Köhler, Louis, Lehéricy, Luc, Mezzadri, Giulia, Reynaud-Bouret, Patricia
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2502.13186
author Aubert, Julien
Köhler, Louis
Lehéricy, Luc
Mezzadri, Giulia
Reynaud-Bouret, Patricia
author_facet Aubert, Julien
Köhler, Louis
Lehéricy, Luc
Mezzadri, Giulia
Reynaud-Bouret, Patricia
contents Learning for animals or humans is the process that leads to behaviors better adapted to the environment. This process highly depends on the individual that learns and is usually observed only through the individual's actions. This article presents ways to use this individual behavioral data to find the model that best explains how the individual learns. We propose two model selection methods: a general hold-out procedure and an AIC-type criterion, both adapted to non-stationary dependent data. We provide theoretical error bounds for these methods that are close to those of the standard i.i.d. case. To compare these approaches, we apply them to contextual bandit models and illustrate their use on both synthetic and experimental learning data in a human categorization task.
format Preprint
id arxiv_https___arxiv_org_abs_2502_13186
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Model selection for behavioral learning data and applications to contextual bandits
title Model selection for behavioral learning data and applications to contextual bandits
topic Machine Learning
url https://arxiv.org/abs/2502.13186