Bibliographic Details
Main Authors:	Kapu, Nirmal Joshua; Karan, Raghav
Format:	Preprint
Published:	2024
Subjects:	Sound; Artificial Intelligence; Computation and Language; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2411.18636

_version_	1866917850848952320
author	Kapu, Nirmal Joshua; Karan, Raghav
author_facet	Kapu, Nirmal Joshua; Karan, Raghav
contents	This article surveys convolution-based models, including convolutional neural networks (CNNs), Conformers, ResNets, and CRNNs, as speech signal processing models and provides their statistical backgrounds along with their applications in speech recognition, speaker identification, emotion recognition, and speech enhancement. Through a comparative assessment of training cost, model size, accuracy, and speed, we compare the strengths and weaknesses of each model, identify potential errors, and propose avenues for further research, emphasizing the central role these models play in advancing applications of speech technologies.
format	Preprint
id	arxiv_https___arxiv_org_abs_2411_18636
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Towards Advanced Speech Signal Processing: A Statistical Perspective on Convolution-Based Architectures and its Applications Kapu, Nirmal Joshua Karan, Raghav Sound Artificial Intelligence Computation and Language Audio and Speech Processing This article surveys convolution-based models, including convolutional neural networks (CNNs), Conformers, ResNets, and CRNNs, as speech signal processing models and provides their statistical backgrounds along with their applications in speech recognition, speaker identification, emotion recognition, and speech enhancement. Through a comparative assessment of training cost, model size, accuracy, and speed, we compare the strengths and weaknesses of each model, identify potential errors, and propose avenues for further research, emphasizing the central role these models play in advancing applications of speech technologies.
title	Towards Advanced Speech Signal Processing: A Statistical Perspective on Convolution-Based Architectures and its Applications
topic	Sound; Artificial Intelligence; Computation and Language; Audio and Speech Processing
url	https://arxiv.org/abs/2411.18636
