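Staff View: StarEmbed: Benchmarking Time Series Foundation Models on Astronomical Observations of Variable Stars :: Library Catalog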

Bibliographic Details
Main Authors:	Li, Weijian; Chen, Hong-Yu; Rehemtulla, Nabeel; Shah, Ved G.; Wu, Dennis; Kim, Dongho; Lin, Qinjie; Miller, Adam A.; Liu, Han
Format:	Preprint
Published:	2025
Subjects:	Solar and Stellar Astrophysics; Instrumentation and Methods for Astrophysics; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2510.06200

_version_	1866908838652805120
author	Li, Weijian; Chen, Hong-Yu; Rehemtulla, Nabeel; Shah, Ved G.; Wu, Dennis; Kim, Dongho; Lin, Qinjie; Miller, Adam A.; Liu, Han
author_facet	Li, Weijian; Chen, Hong-Yu; Rehemtulla, Nabeel; Shah, Ved G.; Wu, Dennis; Kim, Dongho; Lin, Qinjie; Miller, Adam A.; Liu, Han
contents	Time series foundation models (TSFMs) are increasingly being adopted as highly capable general-purpose time series representation learners. Although their training corpora are vast, they exclude astronomical time series data. Observations of stars produce peta-scale time series with unique challenges including irregular sampling and heteroskedasticity. We introduce StarEmbed, the first public benchmark for rigorous and standardized evaluation of state-of-the-art TSFMs on stellar time series observations ("light curves"). We benchmark on three scientifically motivated downstream tasks: unsupervised clustering, supervised classification, and out-of-distribution source detection. StarEmbed integrates a catalog of expert-vetted labels with multivariate light curves from the Zwicky Transient Facility, yielding ~40k hand-labeled light curves spread across seven astrophysical classes. We evaluate the zero-shot representation capabilities of three TSFMs (MOIRAI, Chronos, Chronos-Bolt) and a domain-specific transformer (Astromer) against handcrafted feature extraction, the long-standing baseline in the astrophysics literature. Our results demonstrate that these TSFMs, especially the Chronos models, which are trained on data completely unlike the astronomical observations, can outperform established astrophysics-specific baselines in some tasks and effectively generalize to entirely new data. In particular, TSFMs deliver state-of-the-art performance on our out-of-distribution source detection benchmark. With the first benchmark of TSFMs on astronomical time series data, we test the limits of their generalization and motivate a paradigm shift in time-domain astronomy from using task-specific, fully supervised pipelines toward adopting generic foundation model representations for the analysis of peta-scale datasets from forthcoming observatories.
format	Preprint
id	arxiv_https___arxiv_org_abs_2510_06200
institution	arXiv
publishDate	2025
record_format	arxiv
title	StarEmbed: Benchmarking Time Series Foundation Models on Astronomical Observations of Variable Stars
topic	Solar and Stellar Astrophysics; Instrumentation and Methods for Astrophysics; Artificial Intelligence
url	https://arxiv.org/abs/2510.06200
