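Staff View: A Comprehensive Review and Taxonomy of Audio-Visual Synchronization Techniques for Realistic Speech Animation :: Library Catalog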

Bibliographic Details
Main Authors:	Fernandes, Jose Geraldo; Nascimento, Sinval; Dominguete, Daniel; Oliveira, André; Rotsen, Lucas; Souza, Gabriel; Brochero, David; Facury, Luiz; Vilela, Mateus; Costa, Hebert; Coelho, Frederico; Braga, Antônio P.
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2407.17430

_version_	1866909298811994112
author	Fernandes, Jose Geraldo; Nascimento, Sinval; Dominguete, Daniel; Oliveira, André; Rotsen, Lucas; Souza, Gabriel; Brochero, David; Facury, Luiz; Vilela, Mateus; Costa, Hebert; Coelho, Frederico; Braga, Antônio P.
author_facet	Fernandes, Jose Geraldo; Nascimento, Sinval; Dominguete, Daniel; Oliveira, André; Rotsen, Lucas; Souza, Gabriel; Brochero, David; Facury, Luiz; Vilela, Mateus; Costa, Hebert; Coelho, Frederico; Braga, Antônio P.
contents	In many applications, synchronizing audio with visuals is crucial, such as in creating graphic animations for films or games, translating movie audio into different languages, and developing metaverse applications. This review explores various methodologies for achieving realistic facial animations from audio inputs, highlighting generative and adaptive models. Addressing challenges like model training costs, dataset availability, and silent moment distributions in audio data, it presents innovative solutions to enhance performance and realism. The research also introduces a new taxonomy to categorize audio-visual synchronization methods based on logistical aspects, advancing the capabilities of virtual assistants, gaming, and interactive digital media.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_17430
institution	arXiv
publishDate	2024
record_format	arxiv
title	A Comprehensive Review and Taxonomy of Audio-Visual Synchronization Techniques for Realistic Speech Animation
topic	Audio and Speech Processing
url	https://arxiv.org/abs/2407.17430

Similar Items