Affichage MARC: :: Library Catalog

Enregistré dans:

Détails bibliographiques
Auteurs principaux:	Young, Sam, Jwa, Yeon-jae, Terao, Kazuhiro
Format:	Preprint
Publié:	2025
Sujets:	High Energy Physics - Experiment Computer Vision and Pattern Recognition Machine Learning
Accès en ligne:	https://arxiv.org/abs/2502.02558
Tags:	Ajouter un tag Pas de tags, Soyez le premier à ajouter un tag!

_version_	1866914385325195264
author	Young, Sam Jwa, Yeon-jae Terao, Kazuhiro
author_facet	Young, Sam Jwa, Yeon-jae Terao, Kazuhiro
contents	Effective self-supervised learning (SSL) techniques have been key to unlocking large datasets for representation learning. While many promising methods have been developed using online corpora and captioned photographs, their application to scientific domains, where data encodes highly specialized knowledge, remains a challenge. Liquid Argon Time Projection Chambers (LArTPCs) provide high-resolution 3D imaging for fundamental physics, but analysis of their sparse, complex point cloud data often relies on supervised methods trained on large simulations, introducing potential biases. We introduce the Point-based Liquid Argon Masked Autoencoder (PoLAr-MAE), applying masked point modeling to unlabeled LArTPC images using domain-specific volumetric tokenization and energy prediction. We show this SSL approach learns physically meaningful trajectory representations directly from data. This yields remarkable data efficiency: fine-tuning on just 100 labeled events achieves track/shower semantic segmentation performance comparable to the state-of-the-art supervised baseline trained on $>$100,000 events. Furthermore, internal attention maps exhibit emergent instance segmentation of particle trajectories. While challenges remain, particularly for fine-grained features, we make concrete SSL's potential for building a foundation model for LArTPC image analysis capable of serving as a common base for all data reconstruction tasks. To facilitate further progress, we release PILArNet-M, a large dataset of 1M LArTPC events. Project site: https://youngsm.com/polarmae.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_02558
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Particle Trajectory Representation Learning with Masked Point Modeling Young, Sam Jwa, Yeon-jae Terao, Kazuhiro High Energy Physics - Experiment Computer Vision and Pattern Recognition Machine Learning Effective self-supervised learning (SSL) techniques have been key to unlocking large datasets for representation learning. While many promising methods have been developed using online corpora and captioned photographs, their application to scientific domains, where data encodes highly specialized knowledge, remains a challenge. Liquid Argon Time Projection Chambers (LArTPCs) provide high-resolution 3D imaging for fundamental physics, but analysis of their sparse, complex point cloud data often relies on supervised methods trained on large simulations, introducing potential biases. We introduce the Point-based Liquid Argon Masked Autoencoder (PoLAr-MAE), applying masked point modeling to unlabeled LArTPC images using domain-specific volumetric tokenization and energy prediction. We show this SSL approach learns physically meaningful trajectory representations directly from data. This yields remarkable data efficiency: fine-tuning on just 100 labeled events achieves track/shower semantic segmentation performance comparable to the state-of-the-art supervised baseline trained on $>$100,000 events. Furthermore, internal attention maps exhibit emergent instance segmentation of particle trajectories. While challenges remain, particularly for fine-grained features, we make concrete SSL's potential for building a foundation model for LArTPC image analysis capable of serving as a common base for all data reconstruction tasks. To facilitate further progress, we release PILArNet-M, a large dataset of 1M LArTPC events. Project site: https://youngsm.com/polarmae.
title	Particle Trajectory Representation Learning with Masked Point Modeling
topic	High Energy Physics - Experiment Computer Vision and Pattern Recognition Machine Learning
url	https://arxiv.org/abs/2502.02558

Documents similaires