| Main Authors: | Choi, Ryuhaerang, Chatterjee, Soumyajit, Spathis, Dimitris, Lee, Sung-Ju, Kawsar, Fahim, Malekzadeh, Mohammad |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.23008 |
Similar Items
DeFT-Mamba: Universal Multichannel Sound Separation and Polyphonic Audio Classification
by: Lee, Dongheon, et al.
Published: (2024)
SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer
by: Wang, Helin, et al.
Published: (2024)
ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds
by: Wijngaard, Gijs, et al.
Published: (2024)
DnR-nonverbal: Cinematic Audio Source Separation Dataset Containing Non-Verbal Sounds
by: Hasumi, Takuya, et al.
Published: (2025)
Discrete Audio Representations for Automated Audio Captioning
by: Tian, Jingguang, et al.
Published: (2025)
AudioSpa: Spatializing Sound Events with Text
by: Feng, Linfeng, et al.
Published: (2025)
Region-Specific Audio Tagging for Spatial Sound
by: Zhao, Jinzheng, et al.
Published: (2025)
The Sounds of Home: A Speech-Removed Residential Audio Dataset for Sound Event Detection
by: Bibbó, Gabriel, et al.
Published: (2024)
Correlation of Fréchet Audio Distance With Human Perception of Environmental Audio Is Embedding Dependant
by: Tailleur, Modan, et al.
Published: (2024)
AudioCIL: A Python Toolbox for Audio Class-Incremental Learning with Multiple Scenes
by: Xu, Qisheng, et al.
Published: (2024)
Class-Incremental Learning for Multi-Label Audio Classification
by: Mulimani, Manjunath, et al.
Published: (2024)
AudioBERTScore: Objective Evaluation of Environmental Sound Synthesis Based on Similarity of Audio embedding Sequences
by: Kishi, Minoru, et al.
Published: (2025)
SoundBeam meets M2D: Target Sound Extraction with Audio Foundation Model
by: Hernandez-Olivan, Carlos, et al.
Published: (2024)
ASPED: An Audio Dataset for Detecting Pedestrians
by: Seshadri, Pavan, et al.
Published: (2023)
DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset
by: Du, Jiawei, et al.
Published: (2024)
Effective Pre-Training of Audio Transformers for Sound Event Detection
by: Schmid, Florian, et al.
Published: (2024)
Enhance Temporal Relations in Audio Captioning with Sound Event Detection
by: Xie, Zeyu, et al.
Published: (2023)
Sound Check: Auditing Audio Datasets
by: Agnew, William, et al.
Published: (2024)
Online Single-Channel Audio-Based Sound Speed Estimation for Robust Multi-Channel Audio Control
by: Fuglsig, Andreas Jonas, et al.
Published: (2026)
Robust Audio Tagging under Class-wise Supervision Unreliability
by: Hou, Yuanbo, et al.
Published: (2026)
Can Audio Reveal Music Performance Difficulty? Insights from the Piano Syllabus Dataset
by: Ramoneda, Pedro, et al.
Published: (2024)
A Generalist Audio Foundation Model for Comprehensive Body Sound Auscultation
by: Wang, Pingjie, et al.
Published: (2024)
Leveraging Audio-Only Data for Text-Queried Target Sound Extraction
by: Saijo, Kohei, et al.
Published: (2024)
Exploring Self-Supervised Audio Models for Generalized Anomalous Sound Detection
by: Han, Bing, et al.
Published: (2025)
Exploring Text-Queried Sound Event Detection with Audio Source Separation
by: Yin, Han, et al.
Published: (2024)
Listen, Analyze, and Adapt to Learn New Attacks: An Exemplar-Free Class Incremental Learning Method for Audio Deepfake Source Tracing
by: Xiao, Yang, et al.
Published: (2025)
PIAST: A Multimodal Piano Dataset with Audio, Symbolic and Text
by: Bang, Hayeon, et al.
Published: (2024)
Analytic Class Incremental Learning for Sound Source Localization with Privacy Protection
by: Qian, Xinyuan, et al.
Published: (2024)
UCIL: An Unsupervised Class Incremental Learning Approach for Sound Event Detection
by: Xiao, Yang, et al.
Published: (2024)
Exploring Perceptual Audio Quality Measurement on Stereo Processing Using the Open Dataset of Audio Quality
by: Delgado, Pablo M., et al.
Published: (2025)
E-BATS: Efficient Backpropagation-Free Test-Time Adaptation for Speech Foundation Models
by: Dong, Jiaheng, et al.
Published: (2025)
Audio-Language Datasets of Scenes and Events: A Survey
by: Wijngaard, Gijs, et al.
Published: (2024)
MLAAD: The Multi-Language Audio Anti-Spoofing Dataset
by: Müller, Nicolas M., et al.
Published: (2024)
DiveSound: LLM-Assisted Automatic Taxonomy Construction for Diverse Audio Generation
by: Li, Baihan, et al.
Published: (2024)
Generating Diverse Audio-Visual 360 Soundscapes for Sound Event Localization and Detection
by: Roman, Adrian S., et al.
Published: (2025)
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
by: Kim, Jaeyeon, et al.
Published: (2024)
Construction and Analysis of Impression Caption Dataset for Environmental Sounds
by: Okamoto, Yuki, et al.
Published: (2024)
A Comparative Analysis of Poetry Reading Audio: Singing, Narrating, or Somewhere In Between?
by: Choi, Kahyun, et al.
Published: (2024)
AFT: An Exemplar-Free Class Incremental Learning Method for Environmental Sound Classification
by: Chen, Xinyi, et al.
Published: (2025)
Evaluating CNN with Stacked Feature Representations and Audio Spectrogram Transformer Models for Sound Classification
by: Dehaghania, Parinaz Binandeh, et al.
Published: (2026)