Bibliographic details
Main authors: Alwazzan, Omnia; Patras, Ioannis; Slabaugh, Gregory
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online access: https://arxiv.org/abs/2403.06339
_version_ 1866929271118757888
author Alwazzan, Omnia
Patras, Ioannis
Slabaugh, Gregory
contents Fusion of multimodal healthcare data holds great promise to provide a holistic view of a patient's health, taking advantage of the complementarity of different modalities while leveraging their correlation. This paper proposes a simple and effective approach, inspired by attention, to fuse discriminative features from different modalities. We propose a novel attention mechanism, called Flattened Outer Arithmetic Attention (FOAA), which relies on outer arithmetic operators (addition, subtraction, product, and division) to compute attention scores from keys, queries and values derived from flattened embeddings of each modality. We demonstrate how FOAA can be implemented for self-attention and cross-attention, providing a reusable component in neural network architectures. We evaluate FOAA on two datasets for multimodal tumor classification and achieve state-of-the-art results, and we demonstrate that features enriched by FOAA are superior to those derived from other fusion approaches. The code is publicly available at https://github.com/omniaalwazzan/FOAA
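The abstract describes the mechanism only at a high level, so the following is a rough NumPy sketch of what attention built on outer arithmetic operators might look like. It is not the authors' implementation; their code is in the repository linked in the abstract. Everything beyond the four operators and the query/key/value roles is an assumption made for illustration: the function names, the random projection matrices, the row-wise softmax, the epsilon guard for division, and fusing the four outputs by concatenation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def safe_divide_outer(q, k, eps=1e-6):
    """Outer division q_i / k_j, replacing near-zero divisors by eps (assumption)."""
    k_safe = np.where(np.abs(k) < eps, eps, k)
    return np.divide.outer(q, k_safe)


# The four outer arithmetic operators named in the abstract.
OUTER_OPS = {
    "add": np.add.outer,
    "sub": np.subtract.outer,
    "mul": np.multiply.outer,
    "div": safe_divide_outer,
}


def outer_arithmetic_attention(x_query, x_context, w_q, w_k, w_v, op):
    """Hypothetical single-operator attention on flattened embeddings.

    x_query, x_context: 1-D flattened embeddings. Passing the same vector
    for both gives self-attention; passing two modalities gives cross-attention.
    """
    q = x_query @ w_q                 # query vector, shape (d,)
    k = x_context @ w_k               # key vector, shape (d,)
    v = x_context @ w_v               # value vector, shape (d,)
    scores = OUTER_OPS[op](q, k)      # outer operator -> (d, d) score matrix
    weights = softmax(scores, axis=-1)
    return weights @ v                # attended features, shape (d,)


rng = np.random.default_rng(0)
d_in, d = 8, 4
x_image = rng.normal(size=d_in)       # stand-in for a flattened image embedding
x_other = rng.normal(size=d_in)       # stand-in for a second modality
w_q, w_k, w_v = (rng.normal(size=(d_in, d)) for _ in range(3))

# Cross-attention: queries from one modality, keys and values from the other.
outputs = [
    outer_arithmetic_attention(x_image, x_other, w_q, w_k, w_v, op)
    for op in OUTER_OPS
]
fused = np.concatenate(outputs)       # concatenation is an assumed fusion choice
```

Because each row of the softmax weights sums to one, every output entry is a convex combination of the value vector, regardless of which operator produced the scores. How the paper normalises the scores and combines the four branches may differ from this sketch.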
format Preprint
id arxiv_https___arxiv_org_abs_2403_06339
institution arXiv
publishDate 2024
record_format arxiv
title FOAA: Flattened Outer Arithmetic Attention For Multimodal Tumor Classification
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2403.06339