| Main Authors: | He, Mingrui, Xu, Longting, Wang, Han, Zhang, Mingjun, Das, Rohan Kumar |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.17280 |
Similar Items
Multi-modal Speech Enhancement with Limited Electromyography Channels
by: Feng, Fuyuan, et al.
Published: (2025)
XLSR-Mamba: A Dual-Column Bidirectional State Space Model for Spoofing Attack Detection
by: Xiao, Yang, et al.
Published: (2024)
Listen, Analyze, and Adapt to Learn New Attacks: An Exemplar-Free Class Incremental Learning Method for Audio Deepfake Source Tracing
by: Xiao, Yang, et al.
Published: (2025)
RawTFNet: A Lightweight CNN Architecture for Speech Anti-spoofing
by: Xiao, Yang, et al.
Published: (2025)
UCIL: An Unsupervised Class Incremental Learning Approach for Sound Event Detection
by: Xiao, Yang, et al.
Published: (2024)
WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection System
by: Xiao, Yang, et al.
Published: (2024)
Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection
by: Yin, Han, et al.
Published: (2024)
FMSG-JLESS Submission for DCASE 2024 Task4 on Sound Event Detection with Heterogeneous Training Dataset and Potentially Missing Labels
by: Xiao, Yang, et al.
Published: (2024)
Where's That Voice Coming? Continual Learning for Sound Source Localization
by: Xiao, Yang, et al.
Published: (2024)
TF-Mamba: A Time-Frequency Network for Sound Source Localization
by: Xiao, Yang, et al.
Published: (2024)
Replay Attacks Against Audio Deepfake Detection
by: Müller, Nicolas, et al.
Published: (2025)
Re-Parameterization of Lightweight Transformer for On-Device Speech Emotion Recognition
by: Zhang, Zixing, et al.
Published: (2024)
Self-Attention and Hybrid Features for Replay and Deep-Fake Audio Detection
by: Huang, Lian, et al.
Published: (2024)
Distil-DCCRN: A Small-footprint DCCRN Leveraging Feature-based Knowledge Distillation in Speech Enhancement
by: Han, Runduo, et al.
Published: (2024)
Multilingual Source Tracing of Speech Deepfakes: A First Benchmark
by: Xuan, Xi, et al.
Published: (2025)
Multi-Channel Replay Speech Detection using Acoustic Maps
by: Neri, Michael, et al.
Published: (2026)
Examining the Interplay Between Privacy and Fairness for Speech Processing: A Review and Perspective
by: Leschanowsky, Anna, et al.
Published: (2024)
Exploring Text-Queried Sound Event Detection with Audio Source Separation
by: Yin, Han, et al.
Published: (2024)
Quantum Fourier Transform Based Denoising: Unitary Filtering for Enhanced Speech Clarity
by: Tripathi, Rajeshwar, et al.
Published: (2025)
AdaKWS: Towards Robust Keyword Spotting with Test-Time Adaptation
by: Xiao, Yang, et al.
Published: (2025)
A Lightweight Fourier-based Network for Binaural Speech Enhancement with Spatial Cue Preservation
by: Lu, Xikun, et al.
Published: (2025)
EchoFake: A Replay-Aware Dataset for Practical Speech Deepfake Detection
by: Zhang, Tong, et al.
Published: (2025)
Acoustic Simulation Framework for Multi-channel Replay Speech Detection
by: Neri, Michael, et al.
Published: (2025)
Abusive Speech Detection in Indic Languages Using Acoustic Features
by: Spiesberger, Anika A., et al.
Published: (2024)
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer
by: Liu, Huadai, et al.
Published: (2023)
Unmasking Deepfakes: Leveraging Augmentations and Features Variability for Deepfake Speech Detection
by: Rimon, Inbal, et al.
Published: (2025)
Speech-Declipping Transformer with Complex Spectrogram and Learnerble Temporal Features
by: Kwon, Younghoo, et al.
Published: (2024)
Towards Scalable AASIST: Refining Graph Attention for Speech Deepfake Detection
by: Viakhirev, Ivan, et al.
Published: (2025)
Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing
by: Liu, Tianchi, et al.
Published: (2025)
Decoupled Spatial and Temporal Processing for Resource Efficient Multichannel Speech Enhancement
by: Pandey, Ashutosh, et al.
Published: (2024)
Speaker Anonymisation for Speech-based Suicide Risk Detection
by: Cui, Ziyun, et al.
Published: (2025)
Reverse Attention for Lightweight Speech Enhancement on Edge Devices
by: Ojha, Shuubham, et al.
Published: (2025)
EnvSDD: Benchmarking Environmental Sound Deepfake Detection
by: Yin, Han, et al.
Published: (2025)
Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features
by: Tsunoo, Emiru, et al.
Published: (2024)
Comparative Analysis of ASR Methods for Speech Deepfake Detection
by: Salvi, Davide, et al.
Published: (2024)
Contrastive Loss Based Frame-wise Feature disentanglement for Polyphonic Sound Event Detection
by: Guan, Yadong, et al.
Published: (2024)
An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement
by: Zhang, Qiquan, et al.
Published: (2024)
Transformers in Speech Processing: A Survey
by: Latif, Siddique, et al.
Published: (2023)
Naturalness-Aware Curriculum Learning with Dynamic Temperature for Speech Deepfake Detection
by: Kim, Taewoo, et al.
Published: (2025)
DiTSE: High-Fidelity Generative Speech Enhancement via Latent Diffusion Transformers
by: Guimarães, Heitor R., et al.
Published: (2025)