| Main Authors: | Douwes, Constance, Serizel, Romain |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.05080 |
Similar Items
Energy Consumption Trends in Sound Event Detection Systems
by: Douwes, Constance, et al.
Published: (2024)
The Costs of Reproducibility in Music Separation Research: a Replication of Band-Split RNN
by: Magron, Paul, et al.
Published: (2026)
Normalizing Energy Consumption for Hardware-Independent Evaluation
by: Douwes, Constance, et al.
Published: (2024)
Diffused Responsibility: Analyzing the Energy Consumption of Generative Text-to-Audio Diffusion Models
by: Passoni, Riccardo, et al.
Published: (2025)
A benchmark of state-of-the-art sound event detection systems evaluated on synthetic soundscapes
by: Ronchini, Francesca, et al.
Published: (2022)
Self-Supervised Learning for Few-Shot Bird Sound Classification
by: Moummad, Ilyass, et al.
Published: (2023)
Regularized Contrastive Pre-training for Few-shot Bioacoustic Sound Detection
by: Moummad, Ilyass, et al.
Published: (2023)
The impact of non-target events in synthetic soundscapes for sound event detection
by: Ronchini, Francesca, et al.
Published: (2021)
DCASE 2024 Task 4: Sound Event Detection with Heterogeneous Data and Missing Labels
by: Cornell, Samuele, et al.
Published: (2024)
Mixture of Mixups for Multi-label Classification of Rare Anuran Sounds
by: Moummad, Ilyass, et al.
Published: (2024)
Posterior Transition Modeling for Unsupervised Diffusion-Based Speech Enhancement
by: Sadeghi, Mostafa, et al.
Published: (2025)
Dynamic Gated Recurrent Neural Network for Compute-efficient Speech Enhancement
by: Cheng, Longbiao, et al.
Published: (2024)
Performance and energy balance: a comprehensive study of state-of-the-art sound event detection systems
by: Ronchini, Francesca, et al.
Published: (2023)
Diffusion-based Unsupervised Audio-visual Speech Enhancement
by: Ayilo, Jean-Eudes, et al.
Published: (2024)
Metric Analysis for Spatial Semantic Segmentation of Sound Scenes
by: Mishra, Mayank, et al.
Published: (2025)
Frequency-Weighted Training Losses for Phoneme-Level DNN-based Speech Enhancement
by: Monir, Nasser-Eddine, et al.
Published: (2025)
Angular Distance Distribution Loss for Audio Classification
by: Almudévar, Antonio, et al.
Published: (2024)
Test-Time Training for Depression Detection
by: Dumpala, Sri Harsha, et al.
Published: (2024)
Test-Time Training for Speech Enhancement
by: Behera, Avishkar, et al.
Published: (2025)
Efficient Continual Learning in Keyword Spotting using Binary Neural Networks
by: Vu, Quynh Nguyen-Phuong, et al.
Published: (2025)
Efficient Training of Self-Supervised Speech Foundation Models on a Compute Budget
by: Liu, Andy T., et al.
Published: (2024)
Tracking of Intermittent and Moving Speakers : Dataset and Metrics
by: Iatariene, Taous, et al.
Published: (2025)
A Phoneme-Scale Assessment of Multichannel Speech Enhancement Algorithms
by: Monir, Nasser-Eddine, et al.
Published: (2024)
Evaluating Multichannel Speech Enhancement Algorithms at the Phoneme Scale Across Genders
by: Monir, Nasser-Eddine, et al.
Published: (2025)
EmoHRNet: High-Resolution Neural Network Based Speech Emotion Recognition
by: Muppidi, Akshay, et al.
Published: (2025)
Distributed Acoustic Sensing for Urban Traffic Monitoring: Spatio-Temporal Attention in Recurrent Neural Networks
by: Fakhruzi, Izhan, et al.
Published: (2026)
Computational music analysis from first principles
by: Tymoczko, Dmitri, et al.
Published: (2024)
Score-Based Training for Energy-Based TTS Models
by: Sun, Wanli, et al.
Published: (2025)
Diffusion-based Frameworks for Unsupervised Speech Enhancement
by: Ayilo, Jean-Eudes, et al.
Published: (2026)
From Real to Cloned Singer Identification
by: Desblancs, Dorian, et al.
Published: (2024)
Towards Low-Latency Tracking of Multiple Speakers With Short-Context Speaker Embeddings
by: Iatariene, Taous, et al.
Published: (2025)
Combolutional Neural Networks
by: Churchwell, Cameron, et al.
Published: (2025)
Domain-Invariant Representation Learning of Bird Sounds
by: Moummad, Ilyass, et al.
Published: (2024)
Speech Command Recognition Using LogNNet Reservoir Computing for Embedded Systems
by: Izotov, Yuriy, et al.
Published: (2025)
Causal Prosody Mediation for Text-to-Speech: Counterfactual Training of Duration, Pitch, and Energy in FastSpeech2
by: Mohanty, Suvendu Sekhar
Published: (2026)
Audio-Visual Continual Test-Time Adaptation without Forgetting
by: Maharana, Sarthak Kumar, et al.
Published: (2026)
Training-Free Multimodal Guidance for Video to Audio Generation
by: Grassucci, Eleonora, et al.
Published: (2025)
Training chord recognition models on artificially generated audio
by: Majchrzak, Martyna, et al.
Published: (2025)
SOI: Scaling Down Computational Complexity by Estimating Partial States of the Model
by: Stefański, Grzegorz, et al.
Published: (2024)
From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers
by: Feng, Jiu, et al.
Published: (2024)