| Main Authors: | Singh, Shubhr, Benetos, Emmanouil, Phan, Huy, Stowell, Dan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.03464 |
Similar Items
GraFPrint: A GNN-Based Approach for Audio Identification
by: Bhattacharjee, Aditya, et al.
Published: (2024)
ST-ITO: Controlling Audio Effects for Style Transfer with Inference-Time Optimization
by: Steinmetz, Christian J., et al.
Published: (2024)
Audio-JEPA: Joint-Embedding Predictive Architecture for Audio Representation Learning
by: Tuncay, Ludovic, et al.
Published: (2025)
Acoustic identification of individual animals with hierarchical contrastive learning
by: Nolasco, Ines, et al.
Published: (2024)
LC-Protonets: Multi-Label Few-Shot Learning for World Music Audio Tagging
by: Papaioannou, Charilaos, et al.
Published: (2024)
Compressing Quaternion Convolutional Neural Networks for Audio Classification
by: Singh, Arshdeep, et al.
Published: (2025)
Audio Mamba: Pretrained Audio State Space Model For Audio Tagging
by: Lin, Jiaju, et al.
Published: (2024)
Integrating IP Broadcasting with Audio Tags: Workflow and Challenges
by: Burchett-Vass, Rhys, et al.
Published: (2024)
Perceptual Musical Features for Interpretable Audio Tagging
by: Lyberatos, Vassilis, et al.
Published: (2023)
Classification of Spontaneous and Scripted Speech for Multilingual Audio
by: Elisha, Shahar, et al.
Published: (2024)
Raw Audio Classification with Cosine Convolutional Neural Network (CosCovNN)
by: Haque, Kazi Nazmul, et al.
Published: (2024)
Learning Music Audio Representations With Limited Data
by: Plachouras, Christos, et al.
Published: (2025)
CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following
by: Ma, Yinghao, et al.
Published: (2025)
Heterogeneous bimodal attention fusion for speech emotion recognition
by: Luo, Jiachen, et al.
Published: (2025)
Comprehensive Evaluation of CNN-Based Audio Tagging Models on Resource-Constrained Devices
by: Grau-Haro, Jordi, et al.
Published: (2025)
Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection
by: Liang, Jinhua, et al.
Published: (2024)
Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model
by: Huang, Jiawen, et al.
Published: (2024)
Fundamental Survey on Neuromorphic Based Audio Classification
by: Basu, Amlan, et al.
Published: (2025)
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
by: Kim, Jaeyeon, et al.
Published: (2024)
GraphMuse: A Library for Symbolic Music Graph Processing
by: Karystinaios, Emmanouil, et al.
Published: (2024)
Domain-Invariant Representation Learning of Bird Sounds
by: Moummad, Ilyass, et al.
Published: (2024)
Audio-to-Image Encoding for Improved Voice Characteristic Detection Using Deep Convolutional Neural Networks
by: Atif, Youness
Published: (2025)
In-the-wild Audio Spatialization with Flexible Text-guided Localization
by: Pan, Tianrui, et al.
Published: (2025)
HyperPotter: Spell the Charm of High-Order Interactions in Audio Deepfake Detection
by: Wen, Qing, et al.
Published: (2026)
RUMAA: Repeat-Aware Unified Music Audio Analysis for Score-Performance Alignment, Transcription, and Mistake Detection
by: Chang, Sungkyun, et al.
Published: (2025)
BirdSet: A Large-Scale Dataset for Audio Classification in Avian Bioacoustics
by: Rauch, Lukas, et al.
Published: (2024)
Studying the Effect of Audio Filters in Pre-Trained Models for Environmental Sound Classification
by: Dawn, Aditya, et al.
Published: (2024)
Enhancing Partially Spoofed Audio Localization with Boundary-aware Attention Mechanism
by: Zhong, Jiafeng, et al.
Published: (2024)
AFEN: Respiratory Disease Classification using Ensemble Learning
by: Nadkarni, Rahul, et al.
Published: (2024)
SpectroStream: A Versatile Neural Codec for General Audio
by: Li, Yunpeng, et al.
Published: (2025)
Temporal Information Reconstruction and Non-Aligned Residual in Spiking Neural Networks for Speech Classification
by: Zhang, Qi, et al.
Published: (2024)
Quantum-Inspired Audio Unlearning: Towards Privacy-Preserving Voice Biometrics
by: Pathak, Shreyansh, et al.
Published: (2025)
Automatic acoustic detection of birds through deep learning: the first Bird Audio Detection challenge
by: Stowell, Dan, et al.
Published: (2018)
4,500 Seconds: Small Data Training Approaches for Deep UAV Audio Classification
by: Berg, Andrew P., et al.
Published: (2025)
ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation
by: Feng, Tiantian, et al.
Published: (2024)
AND: Audio Network Dissection for Interpreting Deep Acoustic Models
by: Wu, Tung-Yu, et al.
Published: (2024)
How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?
by: Liu, Tianchi, et al.
Published: (2024)
Towards Leveraging Contrastively Pretrained Neural Audio Embeddings for Recommender Tasks
by: Grötschla, Florian, et al.
Published: (2024)
Domain Adaptation Method and Modality Gap Impact in Audio-Text Models for Prototypical Sound Classification
by: Acevedo, Emiliano, et al.
Published: (2025)
Audio Deepfake Detection in the Age of Advanced Text-to-Speech models
by: Singh, Robin, et al.
Published: (2026)