Bibliographic Details
Main Authors: Chaptoukaev, Hava, Marcianó, Vincenzo, Galati, Francesco, Zuluaga, Maria A.
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2407.20768
_version_ 1866929442775891968
author Chaptoukaev, Hava
Marcianó, Vincenzo
Galati, Francesco
Zuluaga, Maria A.
contents Combining multiple modalities carrying complementary information through multimodal learning (MML) has shown considerable benefits for diagnosing multiple pathologies. However, the robustness of multimodal models to missing modalities is often overlooked. Most works assume modality completeness in the input data, while in clinical practice, it is common to have incomplete modalities. Existing solutions that address this issue rely on modality imputation strategies before using supervised learning models. These strategies, however, are complex, computationally costly and can strongly impact subsequent prediction models. Hence, they should be used with parsimony in sensitive applications such as healthcare. We propose HyperMM, an end-to-end framework designed for learning with varying-sized inputs. Specifically, we focus on the task of supervised MML with missing imaging modalities without using imputation before training. We introduce a novel strategy for training a universal feature extractor using a conditional hypernetwork, and propose a permutation-invariant neural network that can handle inputs of varying dimensions to process the extracted features, in a two-phase task-agnostic framework. We experimentally demonstrate the advantages of our method in two tasks: Alzheimer's disease detection and breast cancer classification. We demonstrate that our strategy is robust to high rates of missing data and that its flexibility allows it to handle varying-sized datasets beyond the scenario of missing modalities.
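The abstract's idea of a permutation-invariant network that accepts a varying number of inputs can be illustrated with a small sketch. This is a generic Deep Sets-style mean-pooling model, not the HyperMM architecture itself, and it leaves out the conditional hypernetwork entirely. The dimensions, random weights, and the names `encode` and `predict` are all hypothetical, chosen only for illustration.

```python
import math
import random

random.seed(0)

FEATURE_DIM = 4  # hypothetical size of each per-modality feature vector
HIDDEN_DIM = 6   # hypothetical hidden width

# Random weights stand in for trained parameters (assumption for illustration).
W_PHI = [[random.gauss(0, 1) for _ in range(HIDDEN_DIM)] for _ in range(FEATURE_DIM)]
W_RHO = [random.gauss(0, 1) for _ in range(HIDDEN_DIM)]


def encode(feature):
    """Embed one modality's feature vector; applied to each element independently."""
    return [
        math.tanh(sum(feature[i] * W_PHI[i][j] for i in range(FEATURE_DIM)))
        for j in range(HIDDEN_DIM)
    ]


def predict(features):
    """Score a non-empty list of per-modality feature vectors, given in any order."""
    embedded = [encode(f) for f in features]
    # Mean pooling ignores element order and yields a fixed-size vector
    # regardless of how many modalities are present.
    pooled = [sum(col) / len(embedded) for col in zip(*embedded)]
    logit = sum(p * w for p, w in zip(pooled, W_RHO))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid, a probability-like score


modalities = [[random.gauss(0, 1) for _ in range(FEATURE_DIM)] for _ in range(3)]

score_full = predict(modalities)            # all three modalities available
score_shuffled = predict(modalities[::-1])  # same modalities, reversed order
score_partial = predict(modalities[:2])     # one modality missing, no imputation
```

Because pooling happens over the set of embedded inputs, reordering the modalities leaves the score unchanged up to floating-point rounding, and a sample with a missing modality is scored without any imputation step.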
format Preprint
id arxiv_https___arxiv_org_abs_2407_20768
institution arXiv
publishDate 2024
record_format arxiv
title HyperMM : Robust Multimodal Learning with Varying-sized Inputs
topic Machine Learning
url https://arxiv.org/abs/2407.20768