Bibliographic Details
Main Authors: Olsen, Kenny Falkær, Østergaard, Mads, Ulbæk, Karl, Nielsen, Søren Føns, Lindrup, Rasmus Malik Høegh, Jensen, Bjørn Sand, Mørup, Morten
Format: Preprint
Published: 2025
Subjects: Machine Learning; Sound; Audio and Speech Processing
Online Access: https://arxiv.org/abs/2507.09768
author Olsen, Kenny Falkær
Østergaard, Mads
Ulbæk, Karl
Nielsen, Søren Føns
Lindrup, Rasmus Malik Høegh
Jensen, Bjørn Sand
Mørup, Morten
contents In recent years, deep learning-based single-channel speech separation has improved considerably, driven in large part by increasingly compute- and parameter-efficient neural network architectures. Most such architectures, however, are designed with a fixed compute and parameter budget and consequently cannot scale to varying compute demands or resources, which limits their use in embedded and heterogeneous devices such as mobile phones and hearables. To enable such use cases, we design a neural network architecture for speech separation and enhancement that is capable of early exit, and we propose an uncertainty-aware probabilistic framework that jointly models the clean speech signal and the error variance, from which we derive probabilistic early-exit conditions in terms of desired signal-to-noise ratios. We evaluate our methods on both speech separation and enhancement tasks and demonstrate that early-exit capabilities can be introduced without compromising reconstruction quality. When trained on variable-length audio, our early-exit conditions are well calibrated, remain directly interpretable, and yield considerable compute savings when used to dynamically scale compute at test time.
format Preprint
id arxiv_https___arxiv_org_abs_2507_09768
institution arXiv
publishDate 2025
record_format arxiv
title Knowing When to Quit: Probabilistic Early Exits for Speech Separation
topic Machine Learning
Sound
Audio and Speech Processing
url https://arxiv.org/abs/2507.09768