Staff View: Speech Editing -- a Summary :: Library Catalog

Bibliographic Details
Main Authors:	Kässmann, Tobias; Liu, Yining; Liu, Danni
Format:	Preprint
Published:	2024
Subjects:	Sound; Computation and Language; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2407.17172

_version_	1866917731797827584
author	Kässmann, Tobias; Liu, Yining; Liu, Danni
author_facet	Kässmann, Tobias; Liu, Yining; Liu, Danni
contents	With the rise of video production and social media, speech editing has become crucial for creators to address issues like mispronunciations, missing words, or stuttering in audio recordings. This paper explores text-based speech editing methods that modify audio via text transcripts without manual waveform editing. These approaches ensure edited audio is indistinguishable from the original by altering the mel-spectrogram. Recent advancements, such as context-aware prosody correction and advanced attention mechanisms, have improved speech editing quality. This paper reviews state-of-the-art methods, compares key metrics, and examines widely used datasets. The aim is to highlight ongoing issues and inspire further research and innovation in speech editing.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_17172
institution	arXiv
publishDate	2024
record_format	arxiv
title	Speech Editing -- a Summary
topic	Sound; Computation and Language; Audio and Speech Processing
url	https://arxiv.org/abs/2407.17172
