:: Library Catalog

Bibliographic Details
Main Authors:	Fang, Zexin; Han, Bin; Sveen, Henrik H.; Cao, C. Clark; Schotten, Hans D.
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2504.00621

Similar Items

DeepASMR: LLM-Based Zero-Shot ASMR Speech Generation for Anyone of Any Voice
by: Zhang, Leying, et al.
Published: (2026)

Exploring the User Experience of AI-Assisted Sound Searching Systems for Creative Workflows
by: Liu, Haohe, et al.
Published: (2025)

Synthetic Speech Classification: IEEE Signal Processing Cup 2022 challenge
by: Rahmun, Mahieyin, et al.
Published: (2024)

A Practical Guide to Spectrogram Analysis for Audio Signal Processing
by: Khodzhaev, Zulfidin
Published: (2024)

Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning
by: Cai, Danwei, et al.
Published: (2024)

Generative AI in Signal Processing Education: An Audio Foundation Model Based Approach
by: Khan, Muhammad Salman, et al.
Published: (2026)

Modulation Discovery with Differentiable Digital Signal Processing
by: Mitcheltree, Christopher, et al.
Published: (2025)

The Database and Benchmark for the Source Speaker Tracing Challenge 2024
by: Li, Ze, et al.
Published: (2024)

Microphone Array Signal Processing and Deep Learning for Speech Enhancement
by: Haeb-Umbach, Reinhold, et al.
Published: (2025)

Rapidly Adapting to New Voice Spoofing: Few-Shot Detection of Synthesized Speech Under Distribution Shifts
by: Garg, Ashi, et al.
Published: (2025)

Scalable Controllable Accented TTS
by: Xinyuan, Henry Li, et al.
Published: (2025)

AI-Driven Cardiorespiratory Signal Processing: Separation, Clustering, and Anomaly Detection
by: Torabi, Yasaman
Published: (2026)

ClearerVoice-Studio: Bridging Advanced Speech Processing Research and Practical Deployment
by: Zhao, Shengkui, et al.
Published: (2025)

Dark Experience for Incremental Keyword Spotting
by: Peng, Tianyi, et al.
Published: (2024)

Generative Deep Learning and Signal Processing for Data Augmentation of Cardiac Auscultation Signals: Improving Model Robustness Using Synthetic Audio
by: Abbott, Leigh, et al.
Published: (2024)

Distinctive Feature Codec: An Adaptive Efficient Speech Representation for Depression Detection
by: Zhang, Xiangyu, et al.
Published: (2025)

dCoNNear: An Artifact-Free Neural Network Architecture for Closed-loop Audio Signal Processing
by: Wen, Chuan, et al.
Published: (2025)

Towards Developing State-of-the-Art TTS Synthesisers for 13 Indian Languages with Signal Processing aided Alignments
by: Prakash, Anusha, et al.
Published: (2022)

Improved Remixing Process for Domain Adaptation-Based Speech Enhancement by Mitigating Data Imbalance in Signal-to-Noise Ratio
by: Li, Li, et al.
Published: (2024)

Single Channel Blind Dereverberation of Speech Signals
by: Nigam, Dhruv
Published: (2025)

Adaptive Per-Channel Energy Normalization Front-end for Robust Audio Signal Processing
by: Meng, Hanyu, et al.
Published: (2025)

Detecting gamma-band responses to the speech envelope for the ICASSP 2024 Auditory EEG Decoding Signal Processing Grand Challenge
by: Thornton, Mike, et al.
Published: (2024)

A Multi-Channel Auditory Signal Encoder with Adaptive Resolution Using Volatile Memristors
by: Guo, Dongxu, et al.
Published: (2025)

Performance and Robustness of Signal-Dependent vs. Signal-Independent Binaural Signal Matching with Wearable Microphone Arrays
by: Berger, Ami, et al.
Published: (2024)

ShiftySpeech: A Large-Scale Synthetic Speech Dataset with Distribution Shifts
by: Garg, Ashi, et al.
Published: (2025)

Transferable Adversarial Attacks against ASR
by: Gao, Xiaoxue, et al.
Published: (2024)

Binaural Signal Matching with Wearable Arrays for Near-Field Sources
by: Goldring, Sapir, et al.
Published: (2025)

ULTRAS -- Unified Learning of Transformer Representations for Audio and Speech Signals
by: E, Ameenudeen P, et al.
Published: (2026)

Spatial Audio Signal Enhancement: A Multi-output MVDR Method in The Spherical Harmonic-domain
by: Zhang, Huawei, et al.
Published: (2024)

Effective User-defined Keyword Spotting with Dual-stage Matching, Multi-modal Enrollment, and Continual Adaptation
by: Ai, Zhiqi, et al.
Published: (2026)

Can LLMs Help Localize Fake Words in Partially Fake Speech?
by: Zhang, Lin, et al.
Published: (2026)

Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement
by: Fujimura, Takuya, et al.
Published: (2025)

Fundamentals of Data-Driven Approaches to Acoustic Signal Detection, Filtering, and Transformation
by: Pan, Chao
Published: (2025)

Speech Quality Embeddings for Improved Detection and Classification of Degradations in Speech Signals
by: Kuhlmann, Michael, et al.
Published: (2026)

RRP-Voice: A Longitudinal Dataset and Benchmark for Recurrent Respiratory Papillomatosis Detection
by: Ren, Wenze, et al.
Published: (2026)

Universal Speech Content Factorization
by: Xinyuan, Henry Li, et al.
Published: (2026)

Binaural Signal Matching with Wearable Arrays for Near-Field Sources and Directional Focus
by: Goldring, Sapir, et al.
Published: (2025)

Physics-Informed Neural Network for Volumetric Sound field Reconstruction of Speech Signals
by: Olivieri, Marco, et al.
Published: (2024)

Concerns for Self-Localization of Ad-Hoc Arrays Using Time Difference of Arrivals
by: Cao, Faxian
Published: (2024)

Harmonics to the Rescue: Why Voiced Speech is Not a WSS Process
by: Bologni, Giovanni, et al.
Published: (2025)