
Bibliographic Details
Main Authors:	Tathe, Aniket; Kamble, Anand; Kumbharkar, Suyash; Bhandare, Atharva; Mitra, Anirban C.
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing; Artificial Intelligence; Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2401.06183

Table of Contents:

Speech has long been a barrier to effective communication and connection, persisting as a challenge in our increasingly interconnected world. This research paper introduces a transformative solution to this persistent obstacle: an end-to-end speech conversion framework tailored for Hindi-to-English translation, culminating in the synthesis of English audio. By integrating cutting-edge technologies such as XLSR Wav2Vec2 for automatic speech recognition (ASR), mBART for neural machine translation (NMT), and a Text-to-Speech (TTS) synthesis component, this framework offers a unified and seamless approach to cross-lingual communication. We delve into the intricate details of each component, elucidating their individual contributions and exploring the synergies that enable a fluid transition from spoken Hindi to synthesized English audio.

Similar Items