| Main Authors: | Berg, Andrew P., Zhang, Qian, Wang, Mia Y. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.23782 |
Similar Items
Studying the Effect of Audio Filters in Pre-Trained Models for Environmental Sound Classification
by: Dawn, Aditya, et al.
Published: (2024)
A Multiclass Acoustic Dataset and Interactive Tool for Analyzing Drone Signatures in Real-World Environments
by: Wang, Mia Y., et al.
Published: (2025)
Representation-Regularized Convolutional Audio Transformer for Audio Understanding
by: Han, Bing, et al.
Published: (2026)
Memory-Efficient Training for Deep Speaker Embedding Learning in Speaker Verification
by: Liu, Bei, et al.
Published: (2024)
Improving Anomalous Sound Detection via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models
by: Zheng, Xinhu, et al.
Published: (2024)
Fundamental Survey on Neuromorphic Based Audio Classification
by: Basu, Amlan, et al.
Published: (2025)
SLAP: Scalable Language-Audio Pretraining with Variable-Duration Audio and Multi-Objective Training
by: Mei, Xinhao, et al.
Published: (2026)
Unlocking Strong Supervision: A Data-Centric Study of General-Purpose Audio Pre-Training Methods
by: Zhou, Xuanru, et al.
Published: (2026)
ModalityMirror: Improving Audio Classification in Modality Heterogeneity Federated Learning with Multimodal Distillation
by: Feng, Tiantian, et al.
Published: (2024)
Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models
by: Song, Zirui, et al.
Published: (2025)
I Can Hear You: Selective Robust Training for Deepfake Audio Detection
by: Zhang, Zirui, et al.
Published: (2024)
AND: Audio Network Dissection for Interpreting Deep Acoustic Models
by: Wu, Tung-Yu, et al.
Published: (2024)
AudioRouter: Data Efficient Audio Understanding via RL based Dual Reasoning
by: Chen, Liyang, et al.
Published: (2026)
Music Genre Classification: A Comparative Analysis of Classical Machine Learning and Deep Learning Approaches
by: Prajuli, Sachin, et al.
Published: (2026)
Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation
by: Luong, Manh, et al.
Published: (2024)
LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging
by: Singh, Shubhr, et al.
Published: (2025)
BirdSet: A Large-Scale Dataset for Audio Classification in Avian Bioacoustics
by: Rauch, Lukas, et al.
Published: (2024)
Raw Audio Classification with Cosine Convolutional Neural Network (CosCovNN)
by: Haque, Kazi Nazmul, et al.
Published: (2024)
Disentangled Training with Adversarial Examples For Robust Small-footprint Keyword Spotting
by: Wang, Zhenyu, et al.
Published: (2024)
JASTIN: Aligning LLMs for Zero-Shot Audio and Speech Evaluation via Natural Language Instructions
by: Zhang, Leying, et al.
Published: (2026)
Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance
by: Milling, Manuel, et al.
Published: (2024)
DGMO: Training-Free Audio Source Separation through Diffusion-Guided Mask Optimization
by: Lee, Geonyoung, et al.
Published: (2025)
Domain Adaptation Method and Modality Gap Impact in Audio-Text Models for Prototypical Sound Classification
by: Acevedo, Emiliano, et al.
Published: (2025)
A Toolchain for Comprehensive Audio/Video Analysis Using Deep Learning Based Multimodal Approach (A use case of riot or violent context detection)
by: Pham, Lam, et al.
Published: (2024)
Audio Mamba: Pretrained Audio State Space Model For Audio Tagging
by: Lin, Jiaju, et al.
Published: (2024)
Compressing Quaternion Convolutional Neural Networks for Audio Classification
by: Singh, Arshdeep, et al.
Published: (2025)
Target Speaker Extraction through Comparing Noisy Positive and Negative Audio Enrollments
by: Xu, Shitong, et al.
Published: (2025)
Deepfake Audio Detection Using Spectrogram-based Feature and Ensemble of Deep Learning Models
by: Pham, Lam, et al.
Published: (2024)
Audio Atlas: Visualizing and Exploring Audio Datasets
by: Lanzendörfer, Luca A., et al.
Published: (2024)
Beyond Silence: Bias Analysis through Loss and Asymmetric Approach in Audio Anti-Spoofing
by: Shim, Hye-jin, et al.
Published: (2024)
FreeAudio: Training-Free Timing Planning for Controllable Long-Form Text-to-Audio Generation
by: Jiang, Yuxuan, et al.
Published: (2025)
DDFAD: Dataset Distillation Framework for Audio Data
by: Jiang, Wenbo, et al.
Published: (2024)
Arabic Music Classification and Generation using Deep Learning
by: Elshaarawy, Mohamed, et al.
Published: (2024)
Audio-to-Image Encoding for Improved Voice Characteristic Detection Using Deep Convolutional Neural Networks
by: Atif, Youness
Published: (2025)
Audio Spatially-Guided Fusion for Audio-Visual Navigation
by: Zhou, Xinyu, et al.
Published: (2026)
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
by: Kim, Jaeyeon, et al.
Published: (2024)
Toward Improving Synthetic Audio Spoofing Detection Robustness via Meta-Learning and Disentangled Training With Adversarial Examples
by: Wang, Zhenyu, et al.
Published: (2024)
Neural-Enhanced Dynamic Range Compression Inversion: A Hybrid Approach for Restoring Audio Dynamics
by: Sun, Haoran, et al.
Published: (2024)
AudioTurbo: Fast Text-to-Audio Generation with Rectified Diffusion
by: Zhao, Junqi, et al.
Published: (2025)
DreamAudio: Customized Text-to-Audio Generation with Diffusion Models
by: Yuan, Yi, et al.
Published: (2025)