:: Library Catalog

Bibliographic Details
Main Authors:	Lu, Jiajun; Huang, Wei; Zhang, Hao
Format:	Preprint
Published:	2023
Subjects:	Sound; Audio and Speech Processing; Signal Processing
Online Access:	https://arxiv.org/abs/2310.09522

Similar Items

Future Full-Ocean Deep SSPs Prediction based on Hierarchical Long Short-Term Memory Neural Networks
by: Lu, Jiajun, et al.
Published: (2023)

STNet: Prediction of Underwater Sound Speed Profiles with An Advanced Semi-Transformer Neural Network
by: Huang, Wei, et al.
Published: (2025)

FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement
by: Hao, Xiang, et al.
Published: (2020)

A Multimodal Data Fusion Attention-Empowered Generative Adversarial Network for Real Time 3D Underwater Sound Speed Field Construction
by: Huang, Wei, et al.
Published: (2025)

Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction
by: Lu, Ye-Xin, et al.
Published: (2024)

PromptEVC: Controllable Emotional Voice Conversion with Natural Language Prompts
by: Qi, Tianhua, et al.
Published: (2025)

Solid State Bus-Comp: A Large-Scale and Diverse Dataset for Dynamic Range Compressor Virtual Analog Modeling
by: Gu, Yicheng, et al.
Published: (2025)

Decomposing the Influence of Physical Acoustic Modeling on Neural Personal Sound Zone Rendering: An Ablation Study
by: Jiang, Hao, et al.
Published: (2026)

Classification of Heart Sounds Using Multi-Branch Deep Convolutional Network and LSTM-CNN
by: Latifi, Seyed Amir, et al.
Published: (2024)

Conditioning and Sampling in Variational Diffusion Models for Speech Super-Resolution
by: Yu, Chin-Yun, et al.
Published: (2022)

Frequency-Based Alignment of EEG and Audio Signals Using Contrastive Learning and SincNet for Auditory Attention Detection
by: Liao, Yuan, et al.
Published: (2025)

PAVITS: Exploring Prosody-aware VITS for End-to-End Emotional Voice Conversion
by: Qi, Tianhua, et al.
Published: (2024)

Towards Realistic Emotional Voice Conversion using Controllable Emotional Intensity
by: Qi, Tianhua, et al.
Published: (2024)

Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition
by: Wang, Kuan-Chen, et al.
Published: (2024)

Lessons Learned from the URGENT 2024 Speech Enhancement Challenge
by: Zhang, Wangyou, et al.
Published: (2025)

Binaural Selective Attention Model for Target Speaker Extraction
by: Meng, Hanyu, et al.
Published: (2024)

SSM2Mel: State Space Model to Reconstruct Mel Spectrogram from the EEG
by: Fan, Cunhang, et al.
Published: (2025)

How Does Instrumental Music Help SingFake Detection?
by: Chen, Xuanjun, et al.
Published: (2025)

Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization
by: Gao, Xiaoxue, et al.
Published: (2024)

Reverse Engineering of Music Mixing Graphs with Differentiable Processors and Iterative Pruning
by: Lee, Sungho, et al.
Published: (2025)

Joint Fullband-Subband Modeling for High-Resolution SingFake Detection
by: Chen, Xuanjun, et al.
Published: (2026)

Beyond Identity: A Generalizable Approach for Deepfake Audio Detection
by: Ahmadiadli, Yasaman, et al.
Published: (2025)

LiSenNet: Lightweight Sub-band and Dual-Path Modeling for Real-Time Speech Enhancement
by: Yan, Haoyin, et al.
Published: (2024)

Adaptive Per-Channel Energy Normalization Front-end for Robust Audio Signal Processing
by: Meng, Hanyu, et al.
Published: (2025)

An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder
by: Gu, Yicheng, et al.
Published: (2024)

Toward Universal Speech Enhancement for Diverse Input Conditions
by: Zhang, Wangyou, et al.
Published: (2023)

Aliasing-Free Neural Audio Synthesis
by: Gu, Yicheng, et al.
Published: (2025)

Machine Learning in Acoustics: A Review and Open-Source Repository
by: McCarthy, Ryan A., et al.
Published: (2025)

SoundSpring: Loss-Resilient Audio Transceiver with Dual-Functional Masked Language Modeling
by: Yao, Shengshi, et al.
Published: (2025)

Self-supervised speech representation and contextual text embedding for match-mismatch classification with EEG recording
by: Wang, Bo, et al.
Published: (2024)

Ultrasensitive Textile Strain Sensors Redefine Wearable Silent Speech Interfaces with High Machine Learning Efficiency
by: Tang, Chenyu, et al.
Published: (2023)

Intelligent Fault Diagnosis of Type and Severity in Low-Frequency, Low Bit-Depth Signals
by: Spadini, Tito, et al.
Published: (2024)

Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration with Improved Intelligibility
by: Liu, Xiaoyu, et al.
Published: (2024)

Lightweight DNN for Full-Band Speech Denoising on Mobile Devices: Exploiting Long and Short Temporal Patterns
by: Drossos, Konstantinos, et al.
Published: (2025)

BRUDEX Database: Binaural Room Impulse Responses with Uniformly Distributed External Microphones
by: Fejgin, Daniel, et al.
Published: (2023)

Exploiting an External Microphone for Binaural RTF-Vector-Based Direction of Arrival Estimation for Multiple Speakers
by: Fejgin, Daniel, et al.
Published: (2023)

String Sound Synthesizer on GPU-accelerated Finite Difference Scheme
by: Lee, Jin Woo, et al.
Published: (2023)

Online Similarity-and-Independence-Aware Beamformer for Low-latency Target Sound Extraction
by: Hiroe, Atsuo
Published: (2023)

Acousto-optic reconstruction of exterior sound field based on concentric circle sampling with circular harmonic expansion
by: Nguyen, Phuc Duc, et al.
Published: (2023)

EchoScan: Scanning Complex Room Geometries via Acoustic Echoes
by: Yeon, Inmo, et al.
Published: (2023)