| Main Authors: | Sashida, Kurumi, Tanaka, Gouhei |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.06271 |
Similar Items
Robust Bioacoustic Detection via Richly Labelled Synthetic Soundscape Augmentation
by: Soltero, Kaspar, et al.
Published: (2025)
Generating Diverse Audio-Visual 360 Soundscapes for Sound Event Localization and Detection
by: Roman, Adrian S., et al.
Published: (2025)
Soundscape Captioning using Sound Affective Quality Network and Large Language Model
by: Hou, Yuanbo, et al.
Published: (2024)
Effective Pre-Training of Audio Transformers for Sound Event Detection
by: Schmid, Florian, et al.
Published: (2024)
PSELDNets: Pre-trained Neural Networks on a Large-scale Synthetic Dataset for Sound Event Localization and Detection
by: Hu, Jinbo, et al.
Published: (2024)
Sound Tagging in Infant-centric Home Soundscapes
by: Khan, Mohammad Nur Hossain, et al.
Published: (2024)
Spatial Scaper: A Library to Simulate and Augment Soundscapes for Sound Event Localization and Detection in Realistic Rooms
by: Roman, Iran R., et al.
Published: (2024)
Feature Selection via Graph Topology Inference for Soundscape Emotion Recognition
by: Rey, Samuel, et al.
Published: (2025)
Autonomous Soundscape Augmentation with Multimodal Fusion of Visual and Participant-linked Inputs
by: Ooi, Kenneth, et al.
Published: (2023)
Source Tracing of Synthetic Speech Systems Through Paralinguistic Pre-Trained Representations
by: Girish, et al.
Published: (2025)
ARAUS: A Large-Scale Dataset and Baseline Models of Affective Responses to Augmented Urban Soundscapes
by: Ooi, Kenneth, et al.
Published: (2022)
w2v-SELD: A Sound Event Localization and Detection Framework for Self-Supervised Spatial Audio Pre-Training
by: Santos, Orlem Lima dos, et al.
Published: (2023)
Evaluating CNN with Stacked Feature Representations and Audio Spectrogram Transformer Models for Sound Classification
by: Dehaghania, Parinaz Binandeh, et al.
Published: (2026)
Mixture of Experts Fusion for Fake Audio Detection Using Frozen wav2vec 2.0
by: Wang, Zhiyong, et al.
Published: (2024)
Retrieval-Augmented Approach for Unsupervised Anomalous Sound Detection and Captioning without Model Training
by: Ogura, Ryoya, et al.
Published: (2024)
Reducing the Gap Between Pretrained Speech Enhancement and Recognition Models Using a Real Speech-Trained Bridging Module
by: Cui, Zhongjian, et al.
Published: (2025)
CNN-based Robust Sound Source Localization with SRP-PHAT for the Extreme Edge
by: Yin, Jun, et al.
Published: (2025)
Generating Moving 3D Soundscapes with Latent Diffusion Models
by: Templin, Christian, et al.
Published: (2025)
Automating Urban Soundscape Enhancements with AI: In-situ Assessment of Quality and Restorativeness in Traffic-Exposed Residential Areas
by: Lam, Bhan, et al.
Published: (2024)
Improvements of Discriminative Feature Space Training for Anomalous Sound Detection in Unlabeled Conditions
by: Fujimura, Takuya, et al.
Published: (2024)
EmoFormer: A Text-Independent Speech Emotion Recognition using a Hybrid Transformer-CNN model
by: Hasan, Rashedul, et al.
Published: (2025)
Improving Anomalous Sound Detection via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models
by: Zheng, Xinhu, et al.
Published: (2024)
Fine-grained Soundscape Control for Augmented Hearing
by: Oh, Seunghyun, et al.
Published: (2026)
Studying the Effect of Audio Filters in Pre-Trained Models for Environmental Sound Classification
by: Dawn, Aditya, et al.
Published: (2024)
Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training
by: Schmid, Florian, et al.
Published: (2024)
Temporal Pooling Strategies for Training-Free Anomalous Sound Detection with Self-Supervised Audio Embeddings
by: Wilkinghoff, Kevin, et al.
Published: (2026)
ENACT-Heart -- ENsemble-based Assessment Using CNN and Transformer on Heart Sounds
by: Han, Jiho, et al.
Published: (2025)
The TMU System for the XACLE Challenge: Training Large Audio Language Models with CLAP Pseudo-Labels
by: Tsutsumi, Ayuto, et al.
Published: (2026)
Fine-Grained Engine Fault Sound Event Detection Using Multimodal Signals
by: Fedorishin, Dennis, et al.
Published: (2024)
Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models
by: Chen, Li-Wei, et al.
Published: (2024)
Sound Field Reconstruction Using a Compact Acoustics-informed Neural Network
by: Ma, Fei, et al.
Published: (2024)
MAGENTA: Magnitude and Geometry-ENhanced Training Approach for Robust Long-Tailed Sound Event Localization and Detection
by: Yeow, Jun-Wei, et al.
Published: (2025)
How Much Does Machine Identity Matter in Anomalous Sound Detection at Test Time?
by: Wilkinghoff, Kevin, et al.
Published: (2026)
FakeMusicCaps: a Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models
by: Comanducci, Luca, et al.
Published: (2024)
Automatic Sound Event Detection and Classification of Great Ape Calls Using Neural Networks
by: Jiang, Zifan, et al.
Published: (2023)
NoiseBandNet: Controllable Time-Varying Neural Synthesis of Sound Effects Using Filterbanks
by: Barahona-Ríos, Adrián, et al.
Published: (2023)
Exploring Self-Supervised Audio Models for Generalized Anomalous Sound Detection
by: Han, Bing, et al.
Published: (2025)
Multichannel Voice Trigger Detection Based on Transform-average-concatenate
by: Higuchi, Takuya, et al.
Published: (2023)
An Enhanced Audio Feature Tailored for Anomalous Sound Detection Based on Pre-trained Models
by: Zhong, Guirui, et al.
Published: (2025)
Joint Analysis of Acoustic Scenes and Sound Events Based on Semi-Supervised Training of Sound Events With Partial Labels
by: Imoto, Keisuke
Published: (2025)