:: Library Catalog

Bibliographic Details
Main Authors:	Peladeau, Côme; Peeters, Geoffroy
Format:	Preprint
Published:	2023
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2310.11781

Similar Items

Scaling up masked audio encoder learning for general audio classification
by: Dinkel, Heinrich, et al.
Published: (2024)

Episodic fine-tuning prototypical networks for optimization-based few-shot learning: Application to audio classification
by: Zhuang, Xuanyu, et al.
Published: (2024)

A Contrastive Self-Supervised Learning scheme for beat tracking amenable to few-shot learning
by: Gagnere, Antonin, et al.
Published: (2024)

Controlling Contrastive Self-Supervised Learning with Knowledge-Driven Multiple Hypothesis: Application to Beat Tracking
by: Gagnere, Antonin, et al.
Published: (2025)

PESTO: Pitch Estimation with Self-supervised Transposition-equivariant Objective
by: Riou, Alain, et al.
Published: (2023)

Tweaking autoregressive methods for inpainting of gaps in audio signals
by: Mokrý, Ondřej, et al.
Published: (2024)

Regularized autoregressive modeling and its application to audio signal reconstruction
by: Mokrý, Ondřej, et al.
Published: (2024)

Compositional nonlinear audio signal processing with Volterra series
by: Araujo-Simon, Jake
Published: (2023)

Testing chatbots on the creation of encoders for audio conditioned image generation
by: León, Jorge E., et al.
Published: (2025)

Unsupervised Harmonic Parameter Estimation Using Differentiable DSP and Spectral Optimal Transport
by: Torres, Bernardo, et al.
Published: (2023)

The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis
by: Torres, Bernardo, et al.
Published: (2025)

Real-time implementation of vibrato transfer as an audio effect
by: Hyrkas, Jeremy
Published: (2025)

Efficient learning-based sound propagation for virtual and real-world audio processing applications
by: Ratnarajah, Anton Jeran
Published: (2024)

Speaker anonymization using neural audio codec language models
by: Panariello, Michele, et al.
Published: (2023)

The role of audio-visual integration in the time course of phonetic encoding in self-supervised speech models
by: Wang, Yi, et al.
Published: (2025)

An audio-quality-based multi-strategy approach for target speaker extraction in the MISP 2023 Challenge
by: Han, Runduo, et al.
Published: (2024)

Zero-shot Musical Stem Retrieval with Joint-Embedding Predictive Architectures
by: Riou, Alain, et al.
Published: (2024)

Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation
by: Riou, Alain, et al.
Published: (2024)

Reconstructing the Charlie Parker Omnibook using an audio-to-score automatic transcription pipeline
by: Riley, Xavier, et al.
Published: (2024)

Towards audio language modeling -- an overview
by: Wu, Haibin, et al.
Published: (2024)

Why some audio signal short-time Fourier transform coefficients have nonuniform phase distributions
by: Voran, Stephen D.
Published: (2024)

Are audio DeepFake detection models polyglots?
by: Marek, Bartłomiej, et al.
Published: (2024)

Low-power SNN-based audio source localisation using a Hilbert Transform spike encoding scheme
by: Haghighatshoar, Saeid, et al.
Published: (2024)

Self-supervised learning method using multiple sampling strategies for general-purpose audio representation
by: Kuroyanagi, Ibuki, et al.
Published: (2025)

MBCodec: Thorough disentangle for high-fidelity audio compression
by: Zhang, Ruonan, et al.
Published: (2025)

AEROMamba: An efficient architecture for audio super-resolution using generative adversarial networks and state space models
by: Abreu, Wallace, et al.
Published: (2024)

Comparison of fundamental frequency estimators with subharmonic voice signals
by: Ikuma, Takeshi, et al.
Published: (2025)

FxSearcher: gradient-free text-driven audio transformation
by: Ki, Hojoon, et al.
Published: (2025)

EDTC: enhance depth of text comprehension in automated audio captioning
by: Tan, Liwen, et al.
Published: (2024)

Exploring bat song syllable representations in self-supervised audio encoders
by: Kloots, Marianne de Heer, et al.
Published: (2024)

Translation-Equivariant Self-Supervised Learning for Pitch Estimation with Optimal Transport
by: Torres, Bernardo, et al.
Published: (2025)

Investigating Design Choices in Joint-Embedding Predictive Architectures for General Audio Representation Learning
by: Riou, Alain, et al.
Published: (2024)

AxLSTMs: learning self-supervised audio representations with xLSTMs
by: Yadav, Sarthak, et al.
Published: (2024)

STASE: A spatialized text-to-audio synthesis engine for music generation
by: Chi, Tutti, et al.
Published: (2025)

Deep learning based spatial aliasing reduction in beamforming for audio capture
by: Guzik, Mateusz, et al.
Published: (2025)

Enhancement by postfiltering for speech and audio coding in ad-hoc sensor networks
by: Das, Sneha, et al.
Published: (2020)

Human-CLAP: Human-perception-based contrastive language-audio pretraining
by: Takano, Taisei, et al.
Published: (2025)

DashengTokenizer: One layer is enough for unified audio understanding and generation
by: Dinkel, Heinrich, et al.
Published: (2026)

RELATE: Subjective evaluation dataset for automatic evaluation of relevance between text and audio
by: Kanamori, Yusuke, et al.
Published: (2025)

ICGAN: An implicit conditioning method for interpretable feature control of neural audio synthesis
by: Liu, Yunyi, et al.
Published: (2024)