| Main Authors: | Lu, Jiajun, Huang, Wei, Zhang, Hao |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online Access: | https://arxiv.org/abs/2310.09522 |
Similar Items
Future Full-Ocean Deep SSPs Prediction based on Hierarchical Long Short-Term Memory Neural Networks
by: Lu, Jiajun, et al.
Published: (2023)
STNet: Prediction of Underwater Sound Speed Profiles with An Advanced Semi-Transformer Neural Network
by: Huang, Wei, et al.
Published: (2025)
FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement
by: Hao, Xiang, et al.
Published: (2020)
A Multimodal Data Fusion Attention-Empowered Generative Adversarial Network for Real Time 3D Underwater Sound Speed Field Construction
by: Huang, Wei, et al.
Published: (2025)
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction
by: Lu, Ye-Xin, et al.
Published: (2024)
PromptEVC: Controllable Emotional Voice Conversion with Natural Language Prompts
by: Qi, Tianhua, et al.
Published: (2025)
Solid State Bus-Comp: A Large-Scale and Diverse Dataset for Dynamic Range Compressor Virtual Analog Modeling
by: Gu, Yicheng, et al.
Published: (2025)
Decomposing the Influence of Physical Acoustic Modeling on Neural Personal Sound Zone Rendering: An Ablation Study
by: Jiang, Hao, et al.
Published: (2026)
Classification of Heart Sounds Using Multi-Branch Deep Convolutional Network and LSTM-CNN
by: Latifi, Seyed Amir, et al.
Published: (2024)
Conditioning and Sampling in Variational Diffusion Models for Speech Super-Resolution
by: Yu, Chin-Yun, et al.
Published: (2022)
Frequency-Based Alignment of EEG and Audio Signals Using Contrastive Learning and SincNet for Auditory Attention Detection
by: Liao, Yuan, et al.
Published: (2025)
PAVITS: Exploring Prosody-aware VITS for End-to-End Emotional Voice Conversion
by: Qi, Tianhua, et al.
Published: (2024)
Towards Realistic Emotional Voice Conversion using Controllable Emotional Intensity
by: Qi, Tianhua, et al.
Published: (2024)
Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition
by: Wang, Kuan-Chen, et al.
Published: (2024)
Lessons Learned from the URGENT 2024 Speech Enhancement Challenge
by: Zhang, Wangyou, et al.
Published: (2025)
Binaural Selective Attention Model for Target Speaker Extraction
by: Meng, Hanyu, et al.
Published: (2024)
SSM2Mel: State Space Model to Reconstruct Mel Spectrogram from the EEG
by: Fan, Cunhang, et al.
Published: (2025)
How Does Instrumental Music Help SingFake Detection?
by: Chen, Xuanjun, et al.
Published: (2025)
Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization
by: Gao, Xiaoxue, et al.
Published: (2024)
Reverse Engineering of Music Mixing Graphs with Differentiable Processors and Iterative Pruning
by: Lee, Sungho, et al.
Published: (2025)
Joint Fullband-Subband Modeling for High-Resolution SingFake Detection
by: Chen, Xuanjun, et al.
Published: (2026)
Beyond Identity: A Generalizable Approach for Deepfake Audio Detection
by: Ahmadiadli, Yasaman, et al.
Published: (2025)
LiSenNet: Lightweight Sub-band and Dual-Path Modeling for Real-Time Speech Enhancement
by: Yan, Haoyin, et al.
Published: (2024)
Adaptive Per-Channel Energy Normalization Front-end for Robust Audio Signal Processing
by: Meng, Hanyu, et al.
Published: (2025)
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder
by: Gu, Yicheng, et al.
Published: (2024)
Toward Universal Speech Enhancement for Diverse Input Conditions
by: Zhang, Wangyou, et al.
Published: (2023)
Aliasing-Free Neural Audio Synthesis
by: Gu, Yicheng, et al.
Published: (2025)
Machine Learning in Acoustics: A Review and Open-Source Repository
by: McCarthy, Ryan A., et al.
Published: (2025)
SoundSpring: Loss-Resilient Audio Transceiver with Dual-Functional Masked Language Modeling
by: Yao, Shengshi, et al.
Published: (2025)
Self-supervised speech representation and contextual text embedding for match-mismatch classification with EEG recording
by: Wang, Bo, et al.
Published: (2024)
Ultrasensitive Textile Strain Sensors Redefine Wearable Silent Speech Interfaces with High Machine Learning Efficiency
by: Tang, Chenyu, et al.
Published: (2023)
Intelligent Fault Diagnosis of Type and Severity in Low-Frequency, Low Bit-Depth Signals
by: Spadini, Tito, et al.
Published: (2024)
Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration with Improved Intelligibility
by: Liu, Xiaoyu, et al.
Published: (2024)
Lightweight DNN for Full-Band Speech Denoising on Mobile Devices: Exploiting Long and Short Temporal Patterns
by: Drossos, Konstantinos, et al.
Published: (2025)
BRUDEX Database: Binaural Room Impulse Responses with Uniformly Distributed External Microphones
by: Fejgin, Daniel, et al.
Published: (2023)
Exploiting an External Microphone for Binaural RTF-Vector-Based Direction of Arrival Estimation for Multiple Speakers
by: Fejgin, Daniel, et al.
Published: (2023)
String Sound Synthesizer on GPU-accelerated Finite Difference Scheme
by: Lee, Jin Woo, et al.
Published: (2023)
Online Similarity-and-Independence-Aware Beamformer for Low-latency Target Sound Extraction
by: Hiroe, Atsuo
Published: (2023)
Acousto-optic reconstruction of exterior sound field based on concentric circle sampling with circular harmonic expansion
by: Nguyen, Phuc Duc, et al.
Published: (2023)
EchoScan: Scanning Complex Room Geometries via Acoustic Echoes
by: Yeon, Inmo, et al.
Published: (2023)