_version_ 1866918144699793408
author Saeed, Numan
Hassan, Salma
Hardan, Shahad
Aly, Ahmed
Taratynova, Darya
Nawaz, Umair
Khan, Ufaq
Ridzuan, Muhammad
Andrearczyk, Vincent
Depeursinge, Adrien
Xie, Yutong
Eugene, Thomas
Metz, Raphaël
Dore, Mélanie
Delpon, Gregory
Papineni, Vijay Ram Kumar
Wahid, Kareem
Dede, Cem
Ali, Alaa Mohamed Shawky
Sjogreen, Carlos
Naser, Mohamed
Fuller, Clifton D.
Oreiller, Valentin
Jreige, Mario
Prior, John O.
Rest, Catherine Cheze Le
Tankyevych, Olena
Decazes, Pierre
Ruan, Su
Tanadini-Lang, Stephanie
Vallières, Martin
Elhalawani, Hesham
Abgral, Ronan
Floch, Romain
Kerleguer, Kevin
Schick, Ulrike
Mauguen, Maelle
Bourhis, David
Leclere, Jean-Christophe
Sambourg, Amandine
Rahmim, Arman
Hatt, Mathieu
Yaqub, Mohammad
contents We present a publicly available multimodal dataset for head and neck cancer research, comprising 1123 annotated Positron Emission Tomography/Computed Tomography (PET/CT) studies from patients with histologically confirmed disease, acquired from 10 international medical centers. All studies contain co-registered PET/CT scans with varying acquisition protocols, reflecting real-world clinical diversity from a long-term, multi-institution retrospective collection. Primary gross tumor volumes (GTVp) and involved lymph nodes (GTVn) were manually segmented by experienced radiation oncologists and radiologists following established guidelines. We provide anonymized NIfTI files, expert-annotated segmentation masks, comprehensive clinical metadata, and radiotherapy dose distributions for a patient subset. The metadata include TNM staging, HPV status, demographics, long-term follow-up outcomes, survival times, censoring indicators, and treatment information. To demonstrate the dataset's utility, we benchmark three key clinical tasks: automated tumor segmentation, recurrence-free survival prediction, and HPV status classification, using state-of-the-art deep learning models such as UNet, SegResNet, and multimodal prognostic frameworks.
format Preprint
id arxiv_https___arxiv_org_abs_2509_00367
institution arXiv
publishDate 2025
record_format arxiv
title A Multimodal and Multi-centric Head and Neck Cancer Dataset for Segmentation, Diagnosis and Outcome Prediction
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2509.00367