:: Library Catalog

Bibliographic Details
Main Authors:	Yin, Hao; Guo, Shi; Jia, Xu; Xu, Xudong; Zhang, Lu; Liu, Si; Wang, Dong; Lu, Huchuan; Xue, Tianfan
Format:	Preprint
Published:	2025
Subjects:	Sound; Artificial Intelligence; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2504.02402

Similar Items

The UmboMic: A PVDF Cantilever Microphone
by: Yeiser, Aaron J., et al.
Published: (2023)

Adaptive high-precision sound source localization at low frequencies based on convolutional neural network
by: Ma, Wenbo, et al.
Published: (2024)

Text2Move: Text-to-moving sound generation via trajectory prediction and temporal alignment
by: Liu, Yunyi, et al.
Published: (2025)

Signal processing algorithm effective for sound quality of hearing loss simulators
by: Irino, Toshio, et al.
Published: (2024)

Few-Shot Bioacoustic Event Detection with Frame-Level Embedding Learning System
by: Zhao, PengYuan, et al.
Published: (2024)

Simi-SFX: A similarity-based conditioning method for controllable sound effect synthesis
by: Liu, Yunyi, et al.
Published: (2024)

Ensemble Confidence Calibration for Sound Event Detection in Open-environment
by: Chen, Yuanjian, et al.
Published: (2025)

Enhancing Stereo Sound Event Detection with BiMamba and Pretrained PSELDnet
by: Gao, Wenmiao, et al.
Published: (2025)

Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection
by: Yue, Haobo, et al.
Published: (2024)

SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR
by: Fan, Zhiyun, et al.
Published: (2024)

Differentiable physics for sound field reconstruction
by: Verburg, Samuel A., et al.
Published: (2025)

Exploiting spatial diversity for increasing the robustness of sound source localization systems against reverberation
by: Garcia-Barrios, Guillermo, et al.
Published: (2024)

Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge
by: Xue, Hongfei, et al.
Published: (2024)

Convert and Speak: Zero-shot Accent Conversion with Minimum Supervision
by: Jia, Zhijun, et al.
Published: (2024)

A Comprehensive Investigation on Speaker Augmentation for Speaker Recognition
by: Zhou, Zhenyu, et al.
Published: (2024)

Frequency-aware convolution for sound event detection
by: Song, Tao, et al.
Published: (2024)

Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection
by: Yin, Han, et al.
Published: (2024)

Lightweight Implicit Neural Network for Binaural Audio Synthesis
by: Lu, Xikun, et al.
Published: (2025)

Combining Deterministic Enhanced Conditions with Dual-Streaming Encoding for Diffusion-Based Speech Enhancement
by: Shi, Hao, et al.
Published: (2025)

Low-latency Speech Enhancement via Speech Token Generation
by: Xue, Huaying, et al.
Published: (2023)

The Neural-SRP method for positional sound source localization
by: Grinstein, Eric, et al.
Published: (2024)

Noise-Robust Sound Event Detection and Counting via Language-Queried Sound Separation
by: Chen, Yuanjian, et al.
Published: (2025)

Some clues to build a sound analysis relevant to hearing
by: Millot, Laurent
Published: (2024)

Interaural time difference loss for binaural target sound extraction
by: Hernandez-Olivan, Carlos, et al.
Published: (2024)

Onset and offset weighted loss function for sound event detection
by: Song, Tao
Published: (2024)

Fine-tune the pretrained ATST model for sound event detection
by: Shao, Nian, et al.
Published: (2023)

FSD50K-Solo: Automated Curation of Single-Source Sound Events
by: Yang, Ningyuan, et al.
Published: (2026)

Selective-Memory Meta-Learning with Environment Representations for Sound Event Localization and Detection
by: Hu, Jinbo, et al.
Published: (2023)

An adaptive filter bank based neural network approach for time delay estimation and speech enhancement
by: Ma, Lu
Published: (2025)

Multimodal Consistency-Guided Reference-Free Data Selection for ASR Accent Adaptation
by: Lei, Ligong, et al.
Published: (2026)

A k-space approach to modeling multi-channel parametric array loudspeaker systems
by: Zhuang, Tao, et al.
Published: (2025)

Exploring Text-Queried Sound Event Detection with Audio Source Separation
by: Yin, Han, et al.
Published: (2024)

FMSG-JLESS Submission for DCASE 2024 Task4 on Sound Event Detection with Heterogeneous Training Dataset and Potentially Missing Labels
by: Xiao, Yang, et al.
Published: (2024)

Representational learning for an anomalous sound detection system with source separation model
by: Shin, Seunghyeon, et al.
Published: (2024)

The role of direct sound spherical harmonics representation in externalization using binaural reproduction
by: Miller, Eran, et al.
Published: (2024)

InsectSet459: an open dataset of insect sounds for bioacoustic machine learning
by: Faiß, Marius, et al.
Published: (2025)

Multispecies bird sound recognition using a fully convolutional neural network
by: García-Ordás, María Teresa, et al.
Published: (2024)

FlashAudio: Rectified Flows for Fast and High-Fidelity Text-to-Audio Generation
by: Liu, Huadai, et al.
Published: (2024)

Efficient learning-based sound propagation for virtual and real-world audio processing applications
by: Ratnarajah, Anton Jeran
Published: (2024)

Binaural sound source localization using a hybrid time and frequency domain model
by: Geva, Gil, et al.
Published: (2024)