| Main Authors: | Giordano, Marco; Giacomelli, Stefano; Rinaldi, Claudia; Graziosi, Fabio |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Sound; Artificial Intelligence; Audio and Speech Processing |
| Online Access: | https://arxiv.org/abs/2507.01563 |
| _version_ | 1866916821590867968 |
|---|---|
| author | Giordano, Marco; Giacomelli, Stefano; Rinaldi, Claudia; Graziosi, Fabio |
| author_facet | Giordano, Marco; Giacomelli, Stefano; Rinaldi, Claudia; Graziosi, Fabio |
| contents | We present a full-stack emergency vehicle (EV) siren detection system designed for real-time deployment on embedded hardware. The proposed approach is based on E2PANNs, a fine-tuned convolutional neural network derived from EPANNs, and optimized for binary sound event detection under urban acoustic conditions. A key contribution is the creation of curated and semantically structured datasets - AudioSet-EV, AudioSet-EV Augmented, and Unified-EV - developed using a custom AudioSet-Tools framework to overcome the low reliability of standard AudioSet annotations. The system is deployed on a Raspberry Pi 5 equipped with a high-fidelity DAC+microphone board, implementing a multithreaded inference engine with adaptive frame sizing, probability smoothing, and a decision-state machine to control false positive activations. A remote WebSocket interface provides real-time monitoring and facilitates live demonstration capabilities. Performance is evaluated using both framewise and event-based metrics across multiple configurations. Results show the system achieves low-latency detection with improved robustness under realistic audio conditions. This work demonstrates the feasibility of deploying IoS-compatible SED solutions that can form distributed acoustic monitoring networks, enabling collaborative emergency vehicle tracking across smart city infrastructures through WebSocket connectivity on low-cost edge devices. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2507_01563 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | Real-Time Emergency Vehicle Siren Detection with Efficient CNNs on Embedded Hardware |
| topic | Sound; Artificial Intelligence; Audio and Speech Processing; 68T07 (Primary), 68T10 (Secondary); B.1.5; B.4.5; C.3; C.4; I.2; K.4; J.2 |
| url | https://arxiv.org/abs/2507.01563 |
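
The abstract mentions probability smoothing and a decision-state machine for controlling false positive activations, but this record does not say how either is implemented. The following is only a minimal sketch of one common way to combine the two ideas, assuming a moving-average smoother and a two-threshold hysteresis machine. The class name `SirenDecision`, the window size, the thresholds, and the `min_frames` parameter are all illustrative assumptions and are not taken from the paper.

```python
from collections import deque


class SirenDecision:
    """Hypothetical smoother plus two-state machine (not the authors' code)."""

    def __init__(self, window=2, on_threshold=0.7, off_threshold=0.3, min_frames=2):
        # Keep only the most recent `window` frame probabilities.
        self.history = deque(maxlen=window)
        self.on_threshold = on_threshold    # smoothed value needed to switch on
        self.off_threshold = off_threshold  # smoothed value needed to switch off
        self.min_frames = min_frames        # consecutive frames required to switch
        self.active = False                 # current state: siren detected or not
        self.streak = 0                     # consecutive frames voting for a switch

    def update(self, probability):
        """Feed one frame probability and return the current detection state."""
        self.history.append(probability)
        smoothed = sum(self.history) / len(self.history)

        # Hysteresis: the condition for leaving a state differs from the one
        # for entering it, so values near a single threshold cannot cause flicker.
        if self.active:
            wants_switch = smoothed <= self.off_threshold
        else:
            wants_switch = smoothed >= self.on_threshold

        self.streak = self.streak + 1 if wants_switch else 0
        if self.streak >= self.min_frames:
            self.active = not self.active
            self.streak = 0
        return self.active


if __name__ == "__main__":
    detector = SirenDecision()
    # A single spike is averaged away; a sustained run switches the state on.
    for p in [0.1, 0.95, 0.1, 0.9, 0.9, 0.9]:
        print(p, detector.update(p))
```

Under these assumed settings an isolated high-probability frame does not activate the detector, while several consecutive high frames do. Requiring a separate, lower threshold to deactivate keeps the state stable when the model output hovers around a single cut-off.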