| Main Authors: | Sagasti, Amaia, Scaini, Davide, Arteaga, Daniel |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.04471 |
Similar Items
Region-Specific Audio Tagging for Spatial Sound
by: Zhao, Jinzheng, et al.
Published: (2025)
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
by: Yang, Dongchao, et al.
Published: (2023)
Quantifying Spatial Audio Quality Impairment
by: Watcharasupat, Karn N., et al.
Published: (2023)
AudioSpa: Spatializing Sound Events with Text
by: Feng, Linfeng, et al.
Published: (2025)
Deep learning based spatial aliasing reduction in beamforming for audio capture
by: Guzik, Mateusz, et al.
Published: (2025)
Can Large Language Models Understand Spatial Audio?
by: Tang, Changli, et al.
Published: (2024)
Past, Present, and Future of Spatial Audio and Room Acoustics
by: Koyama, Shoichi, et al.
Published: (2025)
Towards Spatial Audio Understanding via Question Answering
by: Sudarsanam, Parthasaarathy, et al.
Published: (2025)
ASAudio: A Survey of Advanced Spatial Audio Research
by: Zhu, Zhiyuan, et al.
Published: (2025)
Sound event localization and detection based on CRNN using rectangular filters and channel rotation data augmentation
by: Ronchini, Francesca, et al.
Published: (2020)
The Extended SONICOM HRTF Dataset and Spatial Audio Metrics Toolbox
by: Poole, Katarina C., et al.
Published: (2025)
SALM: Spatial Audio Language Model with Structured Embeddings for Understanding and Editing
by: Hu, Jinbo, et al.
Published: (2025)
Room Impulse Response Generation Conditioned on Acoustic Parameters
by: Arellano, Silvia, et al.
Published: (2025)
Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models
by: Kushwaha, Saksham Singh, et al.
Published: (2024)
VR-PTOLEMAIC: A Virtual Environment for the Perceptual Testing of Spatial Audio Algorithms
by: Ostan, Paolo, et al.
Published: (2025)
UniSep: Universal Target Audio Separation with Language Models at Scale
by: Wang, Yuanyuan, et al.
Published: (2025)
AUV: Teaching Audio Universal Vector Quantization with Single Nested Codebook
by: Chen, Yushen, et al.
Published: (2025)
Exploring the Potential of Data-Driven Spatial Audio Enhancement Using a Single-Channel Model
by: Santos, Arthur N. dos, et al.
Published: (2024)
DeFT-Mamba: Universal Multichannel Sound Separation and Polyphonic Audio Classification
by: Lee, Dongheon, et al.
Published: (2024)
SAVGBench: Benchmarking Spatially Aligned Audio-Video Generation
by: Shimada, Kazuki, et al.
Published: (2024)
Audio Spatially-Guided Fusion for Audio-Visual Navigation
by: Zhou, Xinyu, et al.
Published: (2026)
Room Impulse Response Synthesis via Differentiable Feedback Delay Networks for Efficient Spatial Audio Rendering
by: Gerami, Armin, et al.
Published: (2025)
MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models
by: Gong, Yitian, et al.
Published: (2026)
Stereo Audio Rendering for Personal Sound Zones Using a Binaural Spatially Adaptive Neural Network (BSANN)
by: Jiang, Hao, et al.
Published: (2026)
Streaming Audio Transformers for Online Audio Tagging
by: Dinkel, Heinrich, et al.
Published: (2023)
Discrete Audio Representations for Automated Audio Captioning
by: Tian, Jingguang, et al.
Published: (2025)
Pengi: An Audio Language Model for Audio Tasks
by: Deshmukh, Soham, et al.
Published: (2023)
AV-SSAN: Audio-Visual Selective DoA Estimation through Explicit Multi-Band Semantic-Spatial Alignment
by: Chen, Yu, et al.
Published: (2025)
Audio-Visual Talker Localization in Video for Spatial Sound Reproduction
by: Berghi, Davide, et al.
Published: (2024)
MACE: Leveraging Audio for Evaluating Audio Captioning Systems
by: Dixit, Satvik, et al.
Published: (2024)
Audio Entailment: Assessing Deductive Reasoning for Audio Understanding
by: Deshmukh, Soham, et al.
Published: (2024)
Audio-Mind: An Auditable Agentic Framework for Audio Understanding
by: Wang, Yucheng, et al.
Published: (2026)
SemanticAudio: Audio Generation and Editing in Semantic Space
by: Dai, Zheqi, et al.
Published: (2026)
Speaker Distance Estimation in Enclosures from Single-Channel Audio
by: Neri, Michael, et al.
Published: (2024)
w2v-SELD: A Sound Event Localization and Detection Framework for Self-Supervised Spatial Audio Pre-Training
by: Santos, Orlem Lima dos, et al.
Published: (2023)
SRC-gAudio: Sampling-Rate-Controlled Audio Generation
by: Li, Chenxing, et al.
Published: (2024)
AudioLCM: Text-to-Audio Generation with Latent Consistency Models
by: Liu, Huadai, et al.
Published: (2024)
ALDAS: Audio-Linguistic Data Augmentation for Spoofed Audio Detection
by: Khanjani, Zahra, et al.
Published: (2024)
PAM: Prompting Audio-Language Models for Audio Quality Assessment
by: Deshmukh, Soham, et al.
Published: (2024)
Do Music Source Separation Models Preserve Spatial Information in Binaural Audio?
by: Namballa, Richa, et al.
Published: (2025)