Bibliographic Details
Main Authors: Chen, Wei; Wu, Liang; Lu, Shuyi; Sun, Yuanyuan; Bi, Wenkai; Yuan, Zilong; He, Yaoyao; Wang, Feng; Ma, Junchi; Liu, Shuyong; Cheng, Zhaoping; Hu, Xiaoyan; Qiu, Jianfeng
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2601.12820
author Chen, Wei
Wu, Liang
Lu, Shuyi
Sun, Yuanyuan
Bi, Wenkai
Yuan, Zilong
He, Yaoyao
Wang, Feng
Ma, Junchi
Liu, Shuyong
Cheng, Zhaoping
Hu, Xiaoyan
Qiu, Jianfeng
contents Total-body PET/CT enables system-wide molecular imaging, but heterogeneous anatomical and metabolic signals, approximately 2 m axial coverage, and structured radiology semantics challenge existing medical AI models that assume single-modality inputs, localized fields of view, and coarse image-text alignment. We introduce SDF-HOLO (Systemic Dual-stream Fusion Holo Model), a multimodal foundation model for holistic total-body PET/CT, pre-trained on more than 10,000 patients. SDF-HOLO decouples CT and PET representation learning with dual-stream encoders and couples them through a cross-modal interaction module, allowing anatomical context to refine PET aggregation while metabolic saliency guides subtle morphological reasoning. To model long-range dependencies across the body, hierarchical context modeling combines efficient local windows with global attention. To bridge voxels and clinical language, we use anatomical segmentation masks as explicit semantic anchors and perform voxel-mask-text alignment during pre-training. Across tumor segmentation, low-dose lesion detection, and multilingual diagnostic report generation, SDF-HOLO outperforms strong task-specific and clinical-reference baselines while reducing localization errors and hallucinated findings. Beyond focal interpretation, the model enables system-wide metabolic profiling and reveals tumor-associated fingerprints of inter-organ metabolic network interactions, providing a scalable computational foundation for total-body PET/CT diagnostics and system-level precision oncology.
format Preprint
id arxiv_https___arxiv_org_abs_2601_12820
institution arXiv
publishDate 2026
record_format arxiv
title A Generalist Foundation Model for Total-body PET/CT Enables Diagnostic Reporting and System-wide Metabolic Profiling
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2601.12820
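
The abstract describes dual-stream CT and PET encoders coupled through a cross-modal interaction module. The record gives no implementation details, so the following is only a minimal sketch of one plausible reading: bidirectional single-head cross-attention, where PET tokens query CT tokens and CT tokens query PET tokens. The function names (`cross_attend`, `fuse_streams`), the identity projections, the residual connection, and the toy token values are all assumptions made for illustration and are not taken from the paper. A practical version would use a tensor library and learned projections.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Single-head scaled dot-product attention with a residual connection.

    Each query token attends over all context tokens. Learned projections
    are omitted (identity) to keep the sketch short; this is an assumption,
    not the paper's design.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    updated = []
    for q in queries:
        # Similarity of this query to every context token.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in context]
        weights = softmax(scores)
        # Attention-weighted average of the context tokens.
        mixed = [sum(w * k[d] for w, k in zip(weights, context)) for d in range(dim)]
        # Residual: keep the query's own features and add the context summary.
        updated.append([qi + mi for qi, mi in zip(q, mixed)])
    return updated


def fuse_streams(ct_tokens, pet_tokens):
    """Hypothetical bidirectional exchange: each stream queries the other."""
    pet_refined = cross_attend(pet_tokens, ct_tokens)  # anatomy informs PET
    ct_refined = cross_attend(ct_tokens, pet_tokens)   # metabolism informs CT
    return ct_refined, pet_refined


# Toy inputs: 3 CT tokens and 2 PET tokens, each with 2 features.
ct = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pet = [[0.5, 0.5], [2.0, 0.0]]
ct_out, pet_out = fuse_streams(ct, pet)
print(len(ct_out), len(pet_out))  # prints "3 2"
```

Each stream keeps its own token count after fusion, which is what lets the two encoders stay decoupled while still exchanging information.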