:: Library Catalog

Bibliographic Details
Main Authors:	Sagasti, Amaia; Scaini, Davide; Arteaga, Daniel
Format:	Preprint
Published:	2024
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2405.04471

Similar Items

Region-Specific Audio Tagging for Spatial Sound
by: Zhao, Jinzheng, et al.
Published: (2025)

UniAudio: An Audio Foundation Model Toward Universal Audio Generation
by: Yang, Dongchao, et al.
Published: (2023)

Quantifying Spatial Audio Quality Impairment
by: Watcharasupat, Karn N., et al.
Published: (2023)

AudioSpa: Spatializing Sound Events with Text
by: Feng, Linfeng, et al.
Published: (2025)

Deep learning based spatial aliasing reduction in beamforming for audio capture
by: Guzik, Mateusz, et al.
Published: (2025)

Can Large Language Models Understand Spatial Audio?
by: Tang, Changli, et al.
Published: (2024)

Past, Present, and Future of Spatial Audio and Room Acoustics
by: Koyama, Shoichi, et al.
Published: (2025)

Towards Spatial Audio Understanding via Question Answering
by: Sudarsanam, Parthasaarathy, et al.
Published: (2025)

ASAudio: A Survey of Advanced Spatial Audio Research
by: Zhu, Zhiyuan, et al.
Published: (2025)

Sound event localization and detection based on CRNN using rectangular filters and channel rotation data augmentation
by: Ronchini, Francesca, et al.
Published: (2020)

The Extended SONICOM HRTF Dataset and Spatial Audio Metrics Toolbox
by: Poole, Katarina C., et al.
Published: (2025)

SALM: Spatial Audio Language Model with Structured Embeddings for Understanding and Editing
by: Hu, Jinbo, et al.
Published: (2025)

Room Impulse Response Generation Conditioned on Acoustic Parameters
by: Arellano, Silvia, et al.
Published: (2025)

Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models
by: Kushwaha, Saksham Singh, et al.
Published: (2024)

VR-PTOLEMAIC: A Virtual Environment for the Perceptual Testing of Spatial Audio Algorithms
by: Ostan, Paolo, et al.
Published: (2025)

UniSep: Universal Target Audio Separation with Language Models at Scale
by: Wang, Yuanyuan, et al.
Published: (2025)

AUV: Teaching Audio Universal Vector Quantization with Single Nested Codebook
by: Chen, Yushen, et al.
Published: (2025)

Exploring the Potential of Data-Driven Spatial Audio Enhancement Using a Single-Channel Model
by: Santos, Arthur N. dos, et al.
Published: (2024)

DeFT-Mamba: Universal Multichannel Sound Separation and Polyphonic Audio Classification
by: Lee, Dongheon, et al.
Published: (2024)

SAVGBench: Benchmarking Spatially Aligned Audio-Video Generation
by: Shimada, Kazuki, et al.
Published: (2024)

Audio Spatially-Guided Fusion for Audio-Visual Navigation
by: Zhou, Xinyu, et al.
Published: (2026)

Room Impulse Response Synthesis via Differentiable Feedback Delay Networks for Efficient Spatial Audio Rendering
by: Gerami, Armin, et al.
Published: (2025)

MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models
by: Gong, Yitian, et al.
Published: (2026)

Stereo Audio Rendering for Personal Sound Zones Using a Binaural Spatially Adaptive Neural Network (BSANN)
by: Jiang, Hao, et al.
Published: (2026)

Streaming Audio Transformers for Online Audio Tagging
by: Dinkel, Heinrich, et al.
Published: (2023)

Discrete Audio Representations for Automated Audio Captioning
by: Tian, Jingguang, et al.
Published: (2025)

Pengi: An Audio Language Model for Audio Tasks
by: Deshmukh, Soham, et al.
Published: (2023)

AV-SSAN: Audio-Visual Selective DoA Estimation through Explicit Multi-Band Semantic-Spatial Alignment
by: Chen, Yu, et al.
Published: (2025)

Audio-Visual Talker Localization in Video for Spatial Sound Reproduction
by: Berghi, Davide, et al.
Published: (2024)

MACE: Leveraging Audio for Evaluating Audio Captioning Systems
by: Dixit, Satvik, et al.
Published: (2024)

Audio Entailment: Assessing Deductive Reasoning for Audio Understanding
by: Deshmukh, Soham, et al.
Published: (2024)

Audio-Mind: An Auditable Agentic Framework for Audio Understanding
by: Wang, Yucheng, et al.
Published: (2026)

SemanticAudio: Audio Generation and Editing in Semantic Space
by: Dai, Zheqi, et al.
Published: (2026)

Speaker Distance Estimation in Enclosures from Single-Channel Audio
by: Neri, Michael, et al.
Published: (2024)

w2v-SELD: A Sound Event Localization and Detection Framework for Self-Supervised Spatial Audio Pre-Training
by: Santos, Orlem Lima dos, et al.
Published: (2023)

SRC-gAudio: Sampling-Rate-Controlled Audio Generation
by: Li, Chenxing, et al.
Published: (2024)

AudioLCM: Text-to-Audio Generation with Latent Consistency Models
by: Liu, Huadai, et al.
Published: (2024)

ALDAS: Audio-Linguistic Data Augmentation for Spoofed Audio Detection
by: Khanjani, Zahra, et al.
Published: (2024)

PAM: Prompting Audio-Language Models for Audio Quality Assessment
by: Deshmukh, Soham, et al.
Published: (2024)

Do Music Source Separation Models Preserve Spatial Information in Binaural Audio?
by: Namballa, Richa, et al.
Published: (2025)