Staff View: sleep2vec: Unified Cross-Modal Alignment for Heterogeneous Nocturnal Biosignals

Bibliographic Details
Main Authors:	Yuan, Weixuan; Jin, Zengrui; Wang, Yichen; Xie, Donglin; Ye, Ziyi; Zhang, Chao; Chen, Xuesong
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Signal Processing
Online Access:	https://arxiv.org/abs/2602.13857

_version_	1866911449530499072
author	Yuan, Weixuan; Jin, Zengrui; Wang, Yichen; Xie, Donglin; Ye, Ziyi; Zhang, Chao; Chen, Xuesong
author_facet	Yuan, Weixuan; Jin, Zengrui; Wang, Yichen; Xie, Donglin; Ye, Ziyi; Zhang, Chao; Chen, Xuesong
contents	Tasks ranging from sleep staging to clinical diagnosis traditionally rely on standard polysomnography (PSG) devices, bedside monitors and wearable devices, which capture diverse nocturnal biosignals (e.g., EEG, EOG, ECG, SpO₂). However, heterogeneity across devices and frequent sensor dropout pose significant challenges for unified modelling of these multimodal signals. We present sleep2vec, a foundation model for diverse and incomplete nocturnal biosignals that learns a shared representation via cross-modal alignment. sleep2vec is contrastively pre-trained on 42,249 overnight recordings spanning nine modalities using a Demography, Age, Site & History-aware InfoNCE objective that incorporates physiological and acquisition metadata (e.g., age, gender, recording site) to dynamically weight negatives and mitigate cohort-specific shortcuts. On downstream sleep staging and clinical outcome assessment, sleep2vec consistently outperforms strong baselines and remains robust to any subset of available modalities and sensor dropout. We further characterize, to our knowledge for the first time, scaling laws for nocturnal biosignals with respect to modality diversity and model capacity. Together, these results show that unified cross-modal alignment, coupled with principled scaling, enables label-efficient, general-purpose modelling of real-world nocturnal biosignals.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_13857
institution	arXiv
publishDate	2026
record_format	arxiv
title	sleep2vec: Unified Cross-Modal Alignment for Heterogeneous Nocturnal Biosignals
topic	Machine Learning; Signal Processing
url	https://arxiv.org/abs/2602.13857
