| Main Authors: | Ma, T. Aleksandra, Yin, Sile, Yang, Li-Chia, Zhang, Shuo |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.21448 |
Similar Items
Real-Time System for Audio-Visual Target Speech Enhancement
by: Ma, T. Aleksandra, et al.
Published: (2025)
ImmerseDiffusion: A Generative Spatial Audio Latent Diffusion Model
by: Heydari, Mojtaba, et al.
Published: (2024)
IoT-based Noise Monitoring using Mobile Nodes for Smart Cities
by: Manthina, Bhima Sankar, et al.
Published: (2025)
Quantum Fourier Transform Based Denoising: Unitary Filtering for Enhanced Speech Clarity
by: Tripathi, Rajeshwar, et al.
Published: (2025)
NEUROSEC: FPGA-Based Neuromorphic Audio Security
by: Isik, Murat, et al.
Published: (2024)
EgoTrigger: Toward Audio-Driven Image Capture for Human Memory Enhancement in All-Day Energy-Efficient Smart Glasses
by: Paruchuri, Akshay, et al.
Published: (2025)
AcousAF: Acoustic Sensing-Based Atrial Fibrillation Detection System for Mobile Phones
by: Liu, Xuanyu, et al.
Published: (2024)
Spike Encoding for Environmental Sound: A Comparative Benchmark
by: Larroza, Andres, et al.
Published: (2025)
Acoustic Anomaly Detection on UAM Propeller Defect with Acoustic dataset for Crack of drone Propeller (ADCP)
by: Lee, Juho, et al.
Published: (2025)
An ambient denoising method based on multi-channel non-negative matrix factorization for wheezing detection
by: Muñoz-Montoro, Antonio J., et al.
Published: (2024)
Bridging The Multi-Modality Gaps of Audio, Visual and Linguistic for Speech Enhancement
by: Lin, Meng-Ping, et al.
Published: (2025)
Techniques for Quantum-Computing-Aided Algorithmic Composition: Experiments in Rhythm, Timbre, Harmony, and Space
by: Dobrian, Christopher, et al.
Published: (2025)
Intro to Quantum Harmony: Chords in Superposition
by: Dobrian, Christopher, et al.
Published: (2024)
Scalable Frameworks for Real-World Audio-Visual Speech Recognition
by: Kim, Sungnyun
Published: (2025)
Purification Before Fusion: Toward Mask-Free Speech Enhancement for Robust Audio-Visual Speech Recognition
by: Wu, Linzhi, et al.
Published: (2026)
Pre-training Feature Guided Diffusion Model for Speech Enhancement
by: Yang, Yiyuan, et al.
Published: (2024)
Improving snore detection under limited dataset through harmonic/percussive source separation and convolutional neural networks
by: Gonzalez-Martinez, F. D., et al.
Published: (2024)
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation
by: Kim, Sungnyun, et al.
Published: (2025)
Diffusion-Based Unsupervised Audio-Visual Speech Separation in Noisy Environments with Noise Prior
by: Yemini, Yochai, et al.
Published: (2025)
SSNAPS: Audio-Visual Separation of Speech and Background Noise with Diffusion Inverse Sampling
by: Yemini, Yochai, et al.
Published: (2026)
Audio-Visual Speech Enhancement for Spatial Audio - Spatial-VisualVoice and the MAVE Database
by: Yaffe, Danielle, et al.
Published: (2025)
High-Fidelity Speech Enhancement via Discrete Audio Tokens
by: Lanzendörfer, Luca A., et al.
Published: (2025)
Towards Sub-millisecond Latency Real-Time Speech Enhancement Models on Hearables
by: Dementyev, Artem, et al.
Published: (2024)
LSTMSE-Net: Long Short Term Speech Enhancement Network for Audio-visual Speech Enhancement
by: Jain, Arnav, et al.
Published: (2024)
Speech Enhancement Using Continuous Embeddings of Neural Audio Codec
by: Li, Haoyang, et al.
Published: (2025)
Test-Time Training for Speech Enhancement
by: Behera, Avishkar, et al.
Published: (2025)
MoHAVE: Mixture of Hierarchical Audio-Visual Experts for Robust Speech Recognition
by: Kim, Sungnyun, et al.
Published: (2025)
Variational Quantum Harmonizer: Generating Chord Progressions and Other Sonification Methods with the VQE Algorithm
by: Itaboraí, Paulo Vitor, et al.
Published: (2023)
Developing a Framework for Sonifying Variational Quantum Algorithms: Implications for Music Composition
by: Itaboraí, Paulo Vitor, et al.
Published: (2024)
aTENNuate: Optimized Real-time Speech Enhancement with Deep SSMs on Raw Audio
by: Pei, Yan Ru, et al.
Published: (2024)
An Enhanced Audio Feature Tailored for Anomalous Sound Detection Based on Pre-trained Models
by: Zhong, Guirui, et al.
Published: (2025)
Efficient Adapter Tuning of Pre-trained Speech Models for Automatic Speaker Verification
by: Sang, Mufan, et al.
Published: (2024)
Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses
by: Zhao, Shengkui, et al.
Published: (2021)
Tracking Listener Attention: Gaze-Guided Audio-Visual Speech Enhancement Framework
by: Yang, Hsiang-Cheng, et al.
Published: (2026)
Two Heads Are Better Than One: Audio-Visual Speech Error Correction with Dual Hypotheses
by: Kim, Sungnyun, et al.
Published: (2025)
Incorporating Linguistic Constraints from External Knowledge Source for Audio-Visual Target Speech Extraction
by: Wu, Wenxuan, et al.
Published: (2025)
Towards Audio Codec-based Speech Separation
by: Yip, Jia Qi, et al.
Published: (2024)
Audio-Visual Feature Synchronization for Robust Speech Enhancement in Hearing Aids
by: Saleem, Nasir, et al.
Published: (2025)
Generative Pre-training for Speech with Flow Matching
by: Liu, Alexander H., et al.
Published: (2023)
Multimodal Speech Enhancement Using Burst Propagation
by: Raza, Mohsin, et al.
Published: (2022)