| Main Authors: | Ma, T. Aleksandra, Yin, Sile, Yang, Li-Chia, Zhang, Shuo |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.20741 |
Similar Items
Real-Time Audio-Visual Speech Enhancement Using Pre-trained Visual Representations
by: Ma, T. Aleksandra, et al.
Published: (2025)
ImmerseDiffusion: A Generative Spatial Audio Latent Diffusion Model
by: Heydari, Mojtaba, et al.
Published: (2024)
IoT-based Noise Monitoring using Mobile Nodes for Smart Cities
by: Manthina, Bhima Sankar, et al.
Published: (2025)
Quantum Fourier Transform Based Denoising: Unitary Filtering for Enhanced Speech Clarity
by: Tripathi, Rajeshwar, et al.
Published: (2025)
AcousAF: Acoustic Sensing-Based Atrial Fibrillation Detection System for Mobile Phones
by: Liu, Xuanyu, et al.
Published: (2024)
NEUROSEC: FPGA-Based Neuromorphic Audio Security
by: Isik, Murat, et al.
Published: (2024)
EgoTrigger: Toward Audio-Driven Image Capture for Human Memory Enhancement in All-Day Energy-Efficient Smart Glasses
by: Paruchuri, Akshay, et al.
Published: (2025)
Spike Encoding for Environmental Sound: A Comparative Benchmark
by: Larroza, Andres, et al.
Published: (2025)
Acoustic Anomaly Detection on UAM Propeller Defect with Acoustic dataset for Crack of drone Propeller (ADCP)
by: Lee, Juho, et al.
Published: (2025)
An ambient denoising method based on multi-channel non-negative matrix factorization for wheezing detection
by: Muñoz-Montoro, Antonio J., et al.
Published: (2024)
Bridging The Multi-Modality Gaps of Audio, Visual and Linguistic for Speech Enhancement
by: Lin, Meng-Ping, et al.
Published: (2025)
Techniques for Quantum-Computing-Aided Algorithmic Composition: Experiments in Rhythm, Timbre, Harmony, and Space
by: Dobrian, Christopher, et al.
Published: (2025)
Intro to Quantum Harmony: Chords in Superposition
by: Dobrian, Christopher, et al.
Published: (2024)
Improving snore detection under limited dataset through harmonic/percussive source separation and convolutional neural networks
by: Gonzalez-Martinez, F. D., et al.
Published: (2024)
Purification Before Fusion: Toward Mask-Free Speech Enhancement for Robust Audio-Visual Speech Recognition
by: Wu, Linzhi, et al.
Published: (2026)
Scalable Frameworks for Real-World Audio-Visual Speech Recognition
by: Kim, Sungnyun
Published: (2025)
High-Fidelity Speech Enhancement via Discrete Audio Tokens
by: Lanzendörfer, Luca A., et al.
Published: (2025)
Towards Sub-millisecond Latency Real-Time Speech Enhancement Models on Hearables
by: Dementyev, Artem, et al.
Published: (2024)
LSTMSE-Net: Long Short Term Speech Enhancement Network for Audio-visual Speech Enhancement
by: Jain, Arnav, et al.
Published: (2024)
Incorporating Linguistic Constraints from External Knowledge Source for Audio-Visual Target Speech Extraction
by: Wu, Wenxuan, et al.
Published: (2025)
Test-Time Training for Speech Enhancement
by: Behera, Avishkar, et al.
Published: (2025)
Variational Quantum Harmonizer: Generating Chord Progressions and Other Sonification Methods with the VQE Algorithm
by: Itaboraí, Paulo Vitor, et al.
Published: (2023)
Developing a Framework for Sonifying Variational Quantum Algorithms: Implications for Music Composition
by: Itaboraí, Paulo Vitor, et al.
Published: (2024)
aTENNuate: Optimized Real-time Speech Enhancement with Deep SSMs on Raw Audio
by: Pei, Yan Ru, et al.
Published: (2024)
Diffusion-Based Unsupervised Audio-Visual Speech Separation in Noisy Environments with Noise Prior
by: Yemini, Yochai, et al.
Published: (2025)
Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses
by: Zhao, Shengkui, et al.
Published: (2021)
Speech Enhancement Using Continuous Embeddings of Neural Audio Codec
by: Li, Haoyang, et al.
Published: (2025)
Towards Audio Codec-based Speech Separation
by: Yip, Jia Qi, et al.
Published: (2024)
DeepFilterGAN: A Full-band Real-time Speech Enhancement System with GAN-based Stochastic Regeneration
by: Serbest, Sanberk, et al.
Published: (2025)
Are Modern Speech Enhancement Systems Vulnerable to Adversarial Attacks?
by: Makarov, Rostislav, et al.
Published: (2025)
SSNAPS: Audio-Visual Separation of Speech and Background Noise with Diffusion Inverse Sampling
by: Yemini, Yochai, et al.
Published: (2026)
FSD50K-Solo: Automated Curation of Single-Source Sound Events
by: Yang, Ningyuan, et al.
Published: (2026)
Separate in the Speech Chain: Cross-Modal Conditional Audio-Visual Target Speech Extraction
by: Mu, Zhaoxi, et al.
Published: (2024)
Predictive-Generative Drift Decomposition for Speech Enhancement and Separation
by: Richter, Julius, et al.
Published: (2026)
A Comparative Evaluation of Deep Learning Models for Speech Enhancement in Real-World Noisy Environments
by: Khondkar, Md Jahangir Alam, et al.
Published: (2025)
Extract and Diffuse: Latent Integration for Improved Diffusion-based Speech and Vocal Enhancement
by: Yang, Yudong, et al.
Published: (2024)
TF-MLPNet: Tiny Real-Time Neural Speech Separation
by: Itani, Malek, et al.
Published: (2025)
DDTSE: Discriminative Diffusion Model for Target Speech Extraction
by: Zhang, Leying, et al.
Published: (2023)
TRNet: Two-level Refinement Network leveraging Speech Enhancement for Noise Robust Speech Emotion Recognition
by: Chen, Chengxin, et al.
Published: (2024)
Audio-Visual Speech Enhancement for Spatial Audio - Spatial-VisualVoice and the MAVE Database
by: Yaffe, Danielle, et al.
Published: (2025)