Bibliographic Details
Main Authors: Zhang, Dongzhe, Chen, Jianfeng, Bai, Jisheng, Wang, Mou, Shi, Dongyuan, Niu, Qixiang, Bernardini, Alberto
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2403.20130
_version_ 1866915751134232576
author Zhang, Dongzhe
Chen, Jianfeng
Bai, Jisheng
Wang, Mou
Shi, Dongyuan
Niu, Qixiang
Bernardini, Alberto
author_facet Zhang, Dongzhe
Chen, Jianfeng
Bai, Jisheng
Wang, Mou
Shi, Dongyuan
Niu, Qixiang
Bernardini, Alberto
contents Deep learning-based sound event localization and classification is an emerging research area within wireless acoustic sensor networks. However, current methods for sound event localization and classification typically rely on a single microphone array, making them susceptible to signal attenuation and environmental noise, which limits their monitoring range. Moreover, methods using multiple microphone arrays often focus solely on source localization and neglect sound event classification. In this paper, we propose a deep learning-based method that employs multiple features and attention mechanisms to estimate the location and class of a sound source. We introduce a Soundmap feature to capture spatial information across multiple frequency bands. We also use the Gammatone filter to generate acoustic features more suitable for outdoor environments. Furthermore, we integrate attention mechanisms to learn channel-wise relationships and temporal dependencies within the acoustic features. To evaluate our proposed method, we conduct experiments using simulated datasets with different noise levels and monitoring-area sizes, as well as different arrays and source positions. The experimental results show that our proposed method outperforms state-of-the-art methods in both sound event classification and sound source localization tasks. We also provide further analysis to explain the reasons for the observed errors.
format Preprint
id arxiv_https___arxiv_org_abs_2403_20130
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Sound event localization and classification using WASN in Outdoor Environment
Zhang, Dongzhe
Chen, Jianfeng
Bai, Jisheng
Wang, Mou
Shi, Dongyuan
Niu, Qixiang
Bernardini, Alberto
Sound
Machine Learning
Audio and Speech Processing
title Sound event localization and classification using WASN in Outdoor Environment
topic Sound
Machine Learning
Audio and Speech Processing
url https://arxiv.org/abs/2403.20130