MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Zhao, Yunpeng; Chen, Cheng; Pang, Qing You; Li, Quanzheng; Tang, Carol; Ang, Beng-Ti; Jin, Yueming
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2406.01987

_version_	1866909215974490112
author	Zhao, Yunpeng; Chen, Cheng; Pang, Qing You; Li, Quanzheng; Tang, Carol; Ang, Beng-Ti; Jin, Yueming
author_facet	Zhao, Yunpeng; Chen, Cheng; Pang, Qing You; Li, Quanzheng; Tang, Carol; Ang, Beng-Ti; Jin, Yueming
contents	Addressing missing modalities presents a critical challenge in multimodal learning. Current approaches focus on developing models that can handle modality-incomplete inputs during inference, assuming that the full set of modalities is available for all the data during training. This reliance on full-modality data for training limits the use of abundant modality-incomplete samples that are often encountered in practical settings. In this paper, we propose a robust universal model with modality reconstruction and model personalization, which can effectively tackle missing modalities at both the training and testing stages. Our method leverages a multimodal masked autoencoder to reconstruct the missing modality and masked patches simultaneously, incorporating an innovative distribution approximation mechanism to fully utilize both modality-complete and modality-incomplete data. The reconstructed modalities then contribute to our designed data-model co-distillation scheme to guide model learning in the presence of missing modalities. Moreover, we propose a CLIP-driven hyper-network to personalize partial model parameters, enabling the model to adapt to each distinct missing modality scenario. Our method has been extensively validated on two brain tumor segmentation benchmarks. Experimental results demonstrate the promising performance of our method, which consistently exceeds previous state-of-the-art approaches under all-stage missing modality settings with different missing ratios. Code will be available.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_01987
institution	arXiv
publishDate	2024
record_format	arxiv
title	Dealing with All-stage Missing Modality: Towards A Universal Model with Robust Reconstruction and Personalization
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2406.01987