:: Library Catalog

Bibliographic Details
Main Authors:	Ma, T. Aleksandra; Yin, Sile; Yang, Li-Chia; Zhang, Shuo
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing; Emerging Technologies; Machine Learning
Online Access:	https://arxiv.org/abs/2509.20741

Similar Items

Real-Time Audio-Visual Speech Enhancement Using Pre-trained Visual Representations
by: Ma, T. Aleksandra, et al.
Published: (2025)

ImmerseDiffusion: A Generative Spatial Audio Latent Diffusion Model
by: Heydari, Mojtaba, et al.
Published: (2024)

IoT-based Noise Monitoring using Mobile Nodes for Smart Cities
by: Manthina, Bhima Sankar, et al.
Published: (2025)

Quantum Fourier Transform Based Denoising: Unitary Filtering for Enhanced Speech Clarity
by: Tripathi, Rajeshwar, et al.
Published: (2025)

AcousAF: Acoustic Sensing-Based Atrial Fibrillation Detection System for Mobile Phones
by: Liu, Xuanyu, et al.
Published: (2024)

NEUROSEC: FPGA-Based Neuromorphic Audio Security
by: Isik, Murat, et al.
Published: (2024)

EgoTrigger: Toward Audio-Driven Image Capture for Human Memory Enhancement in All-Day Energy-Efficient Smart Glasses
by: Paruchuri, Akshay, et al.
Published: (2025)

Spike Encoding for Environmental Sound: A Comparative Benchmark
by: Larroza, Andres, et al.
Published: (2025)

Acoustic Anomaly Detection on UAM Propeller Defect with Acoustic dataset for Crack of drone Propeller (ADCP)
by: Lee, Juho, et al.
Published: (2025)

An ambient denoising method based on multi-channel non-negative matrix factorization for wheezing detection
by: Muñoz-Montoro, Antonio J., et al.
Published: (2024)

Bridging The Multi-Modality Gaps of Audio, Visual and Linguistic for Speech Enhancement
by: Lin, Meng-Ping, et al.
Published: (2025)

Techniques for Quantum-Computing-Aided Algorithmic Composition: Experiments in Rhythm, Timbre, Harmony, and Space
by: Dobrian, Christopher, et al.
Published: (2025)

Intro to Quantum Harmony: Chords in Superposition
by: Dobrian, Christopher, et al.
Published: (2024)

Improving snore detection under limited dataset through harmonic/percussive source separation and convolutional neural networks
by: Gonzalez-Martinez, F. D., et al.
Published: (2024)

Purification Before Fusion: Toward Mask-Free Speech Enhancement for Robust Audio-Visual Speech Recognition
by: Wu, Linzhi, et al.
Published: (2026)

Scalable Frameworks for Real-World Audio-Visual Speech Recognition
by: Kim, Sungnyun
Published: (2025)

High-Fidelity Speech Enhancement via Discrete Audio Tokens
by: Lanzendörfer, Luca A., et al.
Published: (2025)

Towards Sub-millisecond Latency Real-Time Speech Enhancement Models on Hearables
by: Dementyev, Artem, et al.
Published: (2024)

LSTMSE-Net: Long Short Term Speech Enhancement Network for Audio-visual Speech Enhancement
by: Jain, Arnav, et al.
Published: (2024)

Incorporating Linguistic Constraints from External Knowledge Source for Audio-Visual Target Speech Extraction
by: Wu, Wenxuan, et al.
Published: (2025)

Test-Time Training for Speech Enhancement
by: Behera, Avishkar, et al.
Published: (2025)

Variational Quantum Harmonizer: Generating Chord Progressions and Other Sonification Methods with the VQE Algorithm
by: Itaboraí, Paulo Vitor, et al.
Published: (2023)

Developing a Framework for Sonifying Variational Quantum Algorithms: Implications for Music Composition
by: Itaboraí, Paulo Vitor, et al.
Published: (2024)

aTENNuate: Optimized Real-time Speech Enhancement with Deep SSMs on Raw Audio
by: Pei, Yan Ru, et al.
Published: (2024)

Diffusion-Based Unsupervised Audio-Visual Speech Separation in Noisy Environments with Noise Prior
by: Yemini, Yochai, et al.
Published: (2025)

Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses
by: Zhao, Shengkui, et al.
Published: (2021)

Speech Enhancement Using Continuous Embeddings of Neural Audio Codec
by: Li, Haoyang, et al.
Published: (2025)

Towards Audio Codec-based Speech Separation
by: Yip, Jia Qi, et al.
Published: (2024)

DeepFilterGAN: A Full-band Real-time Speech Enhancement System with GAN-based Stochastic Regeneration
by: Serbest, Sanberk, et al.
Published: (2025)

Are Modern Speech Enhancement Systems Vulnerable to Adversarial Attacks?
by: Makarov, Rostislav, et al.
Published: (2025)

SSNAPS: Audio-Visual Separation of Speech and Background Noise with Diffusion Inverse Sampling
by: Yemini, Yochai, et al.
Published: (2026)

FSD50K-Solo: Automated Curation of Single-Source Sound Events
by: Yang, Ningyuan, et al.
Published: (2026)

Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction
by: Mu, Zhaoxi, et al.
Published: (2024)

Predictive-Generative Drift Decomposition for Speech Enhancement and Separation
by: Richter, Julius, et al.
Published: (2026)

A Comparative Evaluation of Deep Learning Models for Speech Enhancement in Real-World Noisy Environments
by: Khondkar, Md Jahangir Alam, et al.
Published: (2025)

Extract and Diffuse: Latent Integration for Improved Diffusion-based Speech and Vocal Enhancement
by: Yang, Yudong, et al.
Published: (2024)

TF-MLPNet: Tiny Real-Time Neural Speech Separation
by: Itani, Malek, et al.
Published: (2025)

DDTSE: Discriminative Diffusion Model for Target Speech Extraction
by: Zhang, Leying, et al.
Published: (2023)

TRNet: Two-level Refinement Network leveraging Speech Enhancement for Noise Robust Speech Emotion Recognition
by: Chen, Chengxin, et al.
Published: (2024)

Audio-Visual Speech Enhancement for Spatial Audio - Spatial-VisualVoice and the MAVE Database
by: Yaffe, Danielle, et al.
Published: (2025)