Bibliographic Details
Main Authors: Barahona-Ríos, Adrián, Collins, Tom
Format: Preprint
Published: 2023
Subjects: Sound; Audio and Speech Processing
Online Access: https://arxiv.org/abs/2307.08007
author Barahona-Ríos, Adrián
Collins, Tom
contents Controllable neural audio synthesis of sound effects is a challenging task due to the potential scarcity and spectro-temporal variance of the data. Differentiable digital signal processing (DDSP) synthesisers have been successfully employed to model and control musical and harmonic signals using relatively limited data and computational resources. Here we propose NoiseBandNet, an architecture capable of synthesising and controlling sound effects by filtering white noise through a filterbank, thus going further than previous systems that make assumptions about the harmonic nature of sounds. We evaluate our approach via a series of experiments, modelling footsteps, thunderstorm, pottery, knocking, and metal sound effects. Comparing NoiseBandNet audio reconstruction capabilities to four variants of the DDSP-filtered noise synthesiser, NoiseBandNet scores higher in nine out of ten evaluation categories, establishing a flexible DDSP method for generating time-varying, inharmonic sound effects of arbitrary length with both good time and frequency resolution. Finally, we introduce some potential creative uses of NoiseBandNet, by generating variations, performing loudness transfer, and by training it on user-defined control curves.
format Preprint
id arxiv_https___arxiv_org_abs_2307_08007
institution arXiv
publishDate 2023
record_format arxiv
title NoiseBandNet: Controllable Time-Varying Neural Synthesis of Sound Effects Using Filterbanks
topic Sound
Audio and Speech Processing
url https://arxiv.org/abs/2307.08007