Staff View: Score-Based Multimodal Autoencoder :: Library Catalog

Bibliographic Details
Main Authors:	Wesego, Daniel; Rooshenas, Pedram
Format:	Preprint
Published:	2023
Subjects:	Machine Learning; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2305.15708

_version_	1866929622143205376
author	Wesego, Daniel; Rooshenas, Pedram
author_facet	Wesego, Daniel; Rooshenas, Pedram
contents	Multimodal Variational Autoencoders (VAEs) are a promising class of generative models that facilitate the construction of a tractable posterior within the latent space given multiple modalities. Previous studies have shown that as the number of modalities increases, the generative quality of each modality declines. In this study, we explore an alternative approach to enhancing the generative performance of multimodal VAEs: jointly modeling the latent space of independently trained unimodal VAEs using score-based models (SBMs). The role of the SBM is to enforce multimodal coherence by learning the correlation among the latent variables. Consequently, our model combines the better generative quality of unimodal VAEs with coherent integration across different modalities using the latent score-based model. In addition, our approach provides the best unconditional coherence.
format	Preprint
id	arxiv_https___arxiv_org_abs_2305_15708
institution	arXiv
publishDate	2023
record_format	arxiv
title	Score-Based Multimodal Autoencoder
topic	Machine Learning; Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2305.15708

Similar Items