Bibliographic Details
Main Authors: Shahan, Irfan Nafiz; Auvi, Pulok Ahmed
Format: Preprint
Published: 2024
Subjects: Sound, Artificial Intelligence, Machine Learning, Audio and Speech Processing
Online Access: https://arxiv.org/abs/2411.15082
author Shahan, Irfan Nafiz
Auvi, Pulok Ahmed
contents Voice recognition and speaker identification are vital for applications in security and personal assistants. This paper presents a lightweight 1D-Convolutional Neural Network (1D-CNN) designed to perform speaker identification on minimal datasets. Our approach achieves a validation accuracy of 97.87%, leveraging data augmentation techniques to handle background noise and limited training samples. Future improvements include testing on larger datasets and integrating transfer learning methods to enhance generalizability. We provide all code, the custom dataset, and the trained models to facilitate reproducibility. These resources are available on our GitHub repository: https://github.com/IrfanNafiz/RecMe.
format Preprint
id arxiv_https___arxiv_org_abs_2411_15082
institution arXiv
publishDate 2024
record_format arxiv
title Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network
topic Sound
Artificial Intelligence
Machine Learning
Audio and Speech Processing
url https://arxiv.org/abs/2411.15082