Staff View: Training Data Governance for Brain Foundation Models :: Library Catalog

Bibliographic Details
Main Authors:	Hanley, Margot; Yeh, Jiunn-Tyng; Rodriguez, Ryan; Pilkington, Jack; Farahany, Nita
Format:	Preprint
Published:	2026
Subjects:	Computers and Society; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.02511

_version_	1866911417568854016
author	Hanley, Margot; Yeh, Jiunn-Tyng; Rodriguez, Ryan; Pilkington, Jack; Farahany, Nita
author_facet	Hanley, Margot; Yeh, Jiunn-Tyng; Rodriguez, Ryan; Pilkington, Jack; Farahany, Nita
contents	Brain foundation models bring the foundation model paradigm to the field of neuroscience. Like language and image foundation models, they are general-purpose AI systems pretrained on large-scale datasets that adapt readily to downstream tasks. Unlike text- and image-based models, however, they train on brain data: large datasets of EEG, fMRI, and other neural data types historically collected within tightly governed clinical and research settings. This paper contends that training foundation models on neural data opens new normative territory. Neural data carry stronger expectations of, and claims to, protection than text or images, given their body-derived nature and historical governance within clinical and research settings. Yet the foundation model paradigm subjects them to practices of large-scale repurposing, cross-context stitching, and open-ended downstream application. Furthermore, these practices are now accessible to a much broader range of actors, including commercial developers, against a backdrop of fragmented and unclear governance. To map this territory, we first describe brain foundation models' technical foundations and training-data ecosystem. We then draw on AI ethics, neuroethics, and bioethics to organize concerns across privacy, consent, bias, benefit sharing, and governance. For each, we propose both agenda-setting questions and baseline safeguards as the field matures.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_02511
institution	arXiv
publishDate	2026
record_format	arxiv
title	Training Data Governance for Brain Foundation Models
topic	Computers and Society; Artificial Intelligence
url	https://arxiv.org/abs/2602.02511
