| Main Authors: | Zhang, Dongzhe; Chen, Jianfeng; Bai, Jisheng; Wang, Mou; Shi, Dongyuan; Niu, Qixiang; Bernardini, Alberto |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Sound; Machine Learning; Audio and Speech Processing |
| Online Access: | https://arxiv.org/abs/2403.20130 |
| _version_ | 1866915751134232576 |
|---|---|
| author | Zhang, Dongzhe; Chen, Jianfeng; Bai, Jisheng; Wang, Mou; Shi, Dongyuan; Niu, Qixiang; Bernardini, Alberto |
| author_facet | Zhang, Dongzhe; Chen, Jianfeng; Bai, Jisheng; Wang, Mou; Shi, Dongyuan; Niu, Qixiang; Bernardini, Alberto |
| contents | Deep learning-based sound event localization and classification is an emerging research area within wireless acoustic sensor networks. However, current methods for sound event localization and classification typically rely on a single microphone array, making them susceptible to signal attenuation and environmental noise, which limits their monitoring range. Moreover, methods using multiple microphone arrays often focus solely on source localization, neglecting sound event classification. In this paper, we propose a deep learning-based method that employs multiple features and attention mechanisms to estimate the location and class of a sound source. We introduce a Soundmap feature to capture spatial information across multiple frequency bands. We also use the Gammatone filter to generate acoustic features more suitable for outdoor environments. Furthermore, we integrate attention mechanisms to learn channel-wise relationships and temporal dependencies within the acoustic features. To evaluate our proposed method, we conduct experiments using simulated datasets with different noise levels and monitoring-area sizes, as well as different array and source positions. The experimental results demonstrate the superiority of our proposed method over state-of-the-art methods in both sound event classification and sound source localization tasks. We also provide further analysis to explain the reasons for the observed errors. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2403_20130 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | Sound event localization and classification using WASN in Outdoor Environment |
| topic | Sound; Machine Learning; Audio and Speech Processing |
| url | https://arxiv.org/abs/2403.20130 |