Saved in:
Bibliographic Details
Main Authors: Wan, Zishuo, Kang, Qinqin, Li, Na, Huang, Yi, Zhang, Qianru, Lu, Le, Bian, Yun, Ding, Dawei, Yan, Ke
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2512.04576
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866917299508740096
author Wan, Zishuo
Kang, Qinqin
Li, Na
Huang, Yi
Zhang, Qianru
Lu, Le
Bian, Yun
Ding, Dawei
Yan, Ke
author_facet Wan, Zishuo
Kang, Qinqin
Li, Na
Huang, Yi
Zhang, Qianru
Lu, Le
Bian, Yun
Ding, Dawei
Yan, Ke
contents The accurate diagnosis and segmentation of tumors in contrast-enhanced Computed Tomography (CT) are fundamentally driven by the distinctive hemodynamic profiles of contrast agents over time. However, in real-world clinical practice, complete temporal dynamics are often hard to capture by strict radiation dose limits and inconsistent acquisition protocols across institutions, leading to a prevalent missing modality problem. Existing deep learning approaches typically treat missing phases as absent independent channels, ignoring the inherent temporal continuity of hemodynamics. In this work, we propose Time Attenuated Representation Disentanglement (TARDis), a novel physics-aware framework that redefines missing modalities as missing sample points on a continuous Time-Attenuation Curve. We first hypothesize that the latent feature can be disentangled into a time-invariant static component (anatomy) and a time-dependent dynamic component (perfusion). We achieve this via a dual-path architecture: a quantization-based path using a learnable embedding dictionary to extract consistent anatomical structures, and a probabilistic path using a Hemodynamic Conditional Variational Autoencoder to model dynamic enhancement conditioned on the estimated scan time. This design allows the network to infer missing hemodynamic features by sampling from the learned latent distribution. Extensive experiments on a large-scale multi-modal private abdominal CT dataset (2,282 patients) and two public datasets demonstrate that TARDis significantly outperforms state-of-the-art incomplete modality frameworks. Notably, our method maintains robust diagnostic performance even in extreme data-sparsity scenarios, highlighting its potential for reducing radiation exposure while maintaining diagnostic precision.
format Preprint
id arxiv_https___arxiv_org_abs_2512_04576
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle TARDis: Time Attenuated Representation Disentanglement for Incomplete Multi-Modal Tumor Segmentation and Classification
Wan, Zishuo
Kang, Qinqin
Li, Na
Huang, Yi
Zhang, Qianru
Lu, Le
Bian, Yun
Ding, Dawei
Yan, Ke
Computer Vision and Pattern Recognition
The accurate diagnosis and segmentation of tumors in contrast-enhanced Computed Tomography (CT) are fundamentally driven by the distinctive hemodynamic profiles of contrast agents over time. However, in real-world clinical practice, complete temporal dynamics are often hard to capture by strict radiation dose limits and inconsistent acquisition protocols across institutions, leading to a prevalent missing modality problem. Existing deep learning approaches typically treat missing phases as absent independent channels, ignoring the inherent temporal continuity of hemodynamics. In this work, we propose Time Attenuated Representation Disentanglement (TARDis), a novel physics-aware framework that redefines missing modalities as missing sample points on a continuous Time-Attenuation Curve. We first hypothesize that the latent feature can be disentangled into a time-invariant static component (anatomy) and a time-dependent dynamic component (perfusion). We achieve this via a dual-path architecture: a quantization-based path using a learnable embedding dictionary to extract consistent anatomical structures, and a probabilistic path using a Hemodynamic Conditional Variational Autoencoder to model dynamic enhancement conditioned on the estimated scan time. This design allows the network to infer missing hemodynamic features by sampling from the learned latent distribution. Extensive experiments on a large-scale multi-modal private abdominal CT dataset (2,282 patients) and two public datasets demonstrate that TARDis significantly outperforms state-of-the-art incomplete modality frameworks. Notably, our method maintains robust diagnostic performance even in extreme data-sparsity scenarios, highlighting its potential for reducing radiation exposure while maintaining diagnostic precision.
title TARDis: Time Attenuated Representation Disentanglement for Incomplete Multi-Modal Tumor Segmentation and Classification
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2512.04576