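Staff View: Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network :: Library Catalog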

Bibliographic Details
Main Authors:	Shahan, Irfan Nafiz; Auvi, Pulok Ahmed
Format:	Preprint
Published:	2024
Subjects:	Sound; Artificial Intelligence; Machine Learning; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2411.15082

_version_	1866912130637234176
author	Shahan, Irfan Nafiz; Auvi, Pulok Ahmed
author_facet	Shahan, Irfan Nafiz; Auvi, Pulok Ahmed
contents	Voice recognition and speaker identification are vital for applications in security and personal assistants. This paper presents a lightweight 1D-Convolutional Neural Network (1D-CNN) designed to perform speaker identification on minimal datasets. Our approach achieves a validation accuracy of 97.87%, leveraging data augmentation techniques to handle background noise and limited training samples. Future improvements include testing on larger datasets and integrating transfer learning methods to enhance generalizability. We provide all code, the custom dataset, and the trained models to facilitate reproducibility. These resources are available on our GitHub repository: https://github.com/IrfanNafiz/RecMe.
format	Preprint
id	arxiv_https___arxiv_org_abs_2411_15082
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network Shahan, Irfan Nafiz Auvi, Pulok Ahmed Sound Artificial Intelligence Machine Learning Audio and Speech Processing Voice recognition and speaker identification are vital for applications in security and personal assistants. This paper presents a lightweight 1D-Convolutional Neural Network (1D-CNN) designed to perform speaker identification on minimal datasets. Our approach achieves a validation accuracy of 97.87%, leveraging data augmentation techniques to handle background noise and limited training samples. Future improvements include testing on larger datasets and integrating transfer learning methods to enhance generalizability. We provide all code, the custom dataset, and the trained models to facilitate reproducibility. These resources are available on our GitHub repository: https://github.com/IrfanNafiz/RecMe.
title	Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network
topic	Sound; Artificial Intelligence; Machine Learning; Audio and Speech Processing
url	https://arxiv.org/abs/2411.15082
