:: Library Catalog

Bibliographic Details
Main Authors:	Santos, Arthur N. dos, Masiero, Bruno S., Mateus, Túlio C. L.
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing; Sound
Online Access:	https://arxiv.org/abs/2404.14564

Similar Items

A Survey on 30+ Years of Automatic Singing Assessment and Singing Information Processing
by: Santos, Arthur N. dos, et al.
Published: (2026)

Development of the Listening in Spatialized Noise-Sentences (LiSN-S) Test in Brazilian Portuguese: Presentation Software, Speech Stimuli, and Sentence Equivalence
by: Masiero, Bruno S., et al.
Published: (2024)

An experiment on an automated literature survey of data-driven speech enhancement methods
by: Santos, Arthur dos, et al.
Published: (2023)

Online Single-Channel Audio-Based Sound Speed Estimation for Robust Multi-Channel Audio Control
by: Fuglsig, Andreas Jonas, et al.
Published: (2026)

Speaker Distance Estimation in Enclosures from Single-Channel Audio
by: Neri, Michael, et al.
Published: (2024)

Unsupervised Single-Channel Audio Separation with Diffusion Source Priors
by: Shi, Runwu, et al.
Published: (2025)

RPRA-ADD: Forgery Trace Enhancement-Driven Audio Deepfake Detection
by: Fu, Ruibo, et al.
Published: (2025)

Study of Lightweight Transformer Architectures for Single-Channel Speech Enhancement
by: Zhao, Haixin, et al.
Published: (2025)

w2v-SELD: A Sound Event Localization and Detection Framework for Self-Supervised Spatial Audio Pre-Training
by: Santos, Orlem Lima dos, et al.
Published: (2023)

Quantifying Spatial Audio Quality Impairment
by: Watcharasupat, Karn N., et al.
Published: (2023)

Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models
by: Kushwaha, Saksham Singh, et al.
Published: (2024)

ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds
by: Wijngaard, Gijs, et al.
Published: (2024)

Exploring Perceptual Audio Quality Measurement on Stereo Processing Using the Open Dataset of Audio Quality
by: Delgado, Pablo M., et al.
Published: (2025)

Can Large Language Models Understand Spatial Audio?
by: Tang, Changli, et al.
Published: (2024)

Universal Spatial Audio Transcoder
by: Sagasti, Amaia, et al.
Published: (2024)

FlexIO: Flexible Single- and Multi-Channel Speech Separation and Enhancement
by: Masuyama, Yoshiki, et al.
Published: (2025)

SALM: Spatial Audio Language Model with Structured Embeddings for Understanding and Editing
by: Hu, Jinbo, et al.
Published: (2025)

AudioSpa: Spatializing Sound Events with Text
by: Feng, Linfeng, et al.
Published: (2025)

Region-Specific Audio Tagging for Spatial Sound
by: Zhao, Jinzheng, et al.
Published: (2025)

The Extended SONICOM HRTF Dataset and Spatial Audio Metrics Toolbox
by: Poole, Katarina C., et al.
Published: (2025)

A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds
by: Xu, Xuenan, et al.
Published: (2024)

Exploring Self-Supervised Audio Models for Generalized Anomalous Sound Detection
by: Han, Bing, et al.
Published: (2025)

Multi-modal Speech Enhancement with Limited Electromyography Channels
by: Feng, Fuyuan, et al.
Published: (2025)

Attention-Based Beamformer For Multi-Channel Speech Enhancement
by: Bai, Jinglin, et al.
Published: (2024)

Leveraging Mamba with Full-Face Vision for Audio-Visual Speech Enhancement
by: Chao, Rong, et al.
Published: (2025)

Past, Present, and Future of Spatial Audio and Room Acoustics
by: Koyama, Shoichi, et al.
Published: (2025)

Towards Spatial Audio Understanding via Question Answering
by: Sudarsanam, Parthasaarathy, et al.
Published: (2025)

ASAudio: A Survey of Advanced Spatial Audio Research
by: Zhu, Zhiyuan, et al.
Published: (2025)

Exploring Differences between Human Perception and Model Inference in Audio Event Recognition
by: Tan, Yizhou, et al.
Published: (2024)

Stereo Audio Rendering for Personal Sound Zones Using a Binaural Spatially Adaptive Neural Network (BSANN)
by: Jiang, Hao, et al.
Published: (2026)

FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching
by: Jung, Chaeyoung, et al.
Published: (2024)

Audio Enhancement from Multiple Crowdsourced Recordings: A Simple and Effective Baseline
by: Aziz, Shiran, et al.
Published: (2024)

ALDAS: Audio-Linguistic Data Augmentation for Spoofed Audio Detection
by: Khanjani, Zahra, et al.
Published: (2024)

PrimeK-Net: Multi-scale Spectral Learning via Group Prime-Kernel Convolutional Neural Networks for Single Channel Speech Enhancement
by: Lin, Zizhen, et al.
Published: (2025)

FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement
by: Hao, Xiang, et al.
Published: (2020)

Geometry-Constrained EEG Channel Selection for Brain-Assisted Speech Enhancement
by: Zuo, Keying, et al.
Published: (2024)

Audio-Visual Speech Enhancement in Noisy Environments via Emotion-Based Contextual Cues
by: Hussain, Tassadaq, et al.
Published: (2024)

UniAudio: An Audio Foundation Model Toward Universal Audio Generation
by: Yang, Dongchao, et al.
Published: (2023)

Pengi: An Audio Language Model for Audio Tasks
by: Deshmukh, Soham, et al.
Published: (2023)

xLSTM-SENet: xLSTM for Single-Channel Speech Enhancement
by: Kühne, Nikolai Lund, et al.
Published: (2025)