:: Library Catalog

Bibliographic Details
Main Authors:	Diaz, Rodrigo; Sandler, Mark
Format:	Preprint
Published:	2025
Subjects:	Sound; Machine Learning; Audio and Speech Processing; Computational Physics
Online Access:	https://arxiv.org/abs/2505.05940

Similar Items

Towards Efficient Modelling of String Dynamics: A Comparison of State Space and Koopman based Deep Learning Methods
by: Diaz, Rodrigo, et al.
Published: (2024)

nlm: Real-Time Non-linear Modal Synthesis in Max
by: Diaz, Rodrigo, et al.
Published: (2026)

Evaluation of Neural Surrogates for Physical Modelling Synthesis of Nonlinear Elastic Plates
by: Martin, Carlos De La Vega, et al.
Published: (2025)

Stable Differentiable Modal Synthesis for Learning Nonlinear Dynamics
by: Zheleznov, Victor, et al.
Published: (2026)

Learning Nonlinear Dynamics in Physical Modelling Synthesis using Neural Ordinary Differential Equations
by: Zheleznov, Victor, et al.
Published: (2025)

A Conditioned UNet for Music Source Separation
by: O'Hanlon, Ken, et al.
Published: (2025)

Designing Neural Synthesizers for Low-Latency Interaction
by: Caspe, Franco, et al.
Published: (2025)

Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion
by: Huang, Yujia, et al.
Published: (2024)

Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters
by: Tesch, Kristina, et al.
Published: (2023)

Physics and geometry informed neural operator network with application to acoustic scattering
by: Nair, Siddharth, et al.
Published: (2024)

Cascaded Cross-Modal Transformer for Audio-Textual Classification
by: Ristea, Nicolae-Catalin, et al.
Published: (2024)

Sparse Binarization for Fast Keyword Spotting
by: Svirsky, Jonathan, et al.
Published: (2024)

Multi-Microphone and Multi-Modal Emotion Recognition in Reverberant Environment
by: Cohen, Ohad, et al.
Published: (2024)

Fast Timing-Conditioned Latent Audio Diffusion
by: Evans, Zach, et al.
Published: (2024)

Towards Robust FastSpeech 2 by Modelling Residual Multimodality
by: Kögel, Fabian, et al.
Published: (2023)

Modulation Discovery with Differentiable Digital Signal Processing
by: Mitcheltree, Christopher, et al.
Published: (2025)

Fast and Flexible Audio Bandwidth Extension via Vocos
by: Sharma, Yatharth
Published: (2026)

Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation
by: Lee, Jin Woo, et al.
Published: (2024)

Differentiable All-pole Filters for Time-varying Audio Systems
by: Yu, Chin-Yun, et al.
Published: (2024)

Enhancing Speech Emotion Recognition Through Differentiable Architecture Search
by: Rajapakshe, Thejan, et al.
Published: (2023)

SimulTron: On-Device Simultaneous Speech to Speech Translation
by: Agranovich, Alex, et al.
Published: (2024)

Decodable but not structured: linear probing enables Underwater Acoustic Target Recognition with pretrained audio embeddings
by: Hummel, Hilde I., et al.
Published: (2026)

Exploiting Music Source Separation for Automatic Lyrics Transcription with Whisper
by: Syed, Jaza, et al.
Published: (2025)

A Differentiable Alignment Framework for Sequence-to-Sequence Modeling via Optimal Transport
by: Kaloga, Yacouba, et al.
Published: (2025)

Ultra-lightweight Neural Differential DSP Vocoder For High Quality Speech Synthesis
by: Agrawal, Prabhav, et al.
Published: (2024)

Audio Simulation for Sound Source Localization in Virtual Evironment
by: Di Yuan, Yi, et al.
Published: (2024)

Towards the Synthesis of Non-speech Vocalizations
by: Hoq, Enjamamul, et al.
Published: (2024)

MiSTR: Multi-Modal iEEG-to-Speech Synthesis with Transformer-Based Prosody Prediction and Neural Phase Reconstruction
by: Al-Radhi, Mohammed Salah, et al.
Published: (2025)

Computational music analysis from first principles
by: Tymoczko, Dmitri, et al.
Published: (2024)

Audio Editing with Non-Rigid Text Prompts
by: Paissan, Francesco, et al.
Published: (2023)

An Attention Long Short-Term Memory based system for automatic classification of speech intelligibility
by: Fernández-Díaz, Miguel, et al.
Published: (2024)

Revisit Modality Imbalance at the Decision Layer
by: Ma, Xiaoyu, et al.
Published: (2025)

Modulating State Space Model with SlowFast Framework for Compute-Efficient Ultra Low-Latency Speech Enhancement
by: Cheng, Longbiao, et al.
Published: (2024)

Data-Driven Room Acoustic Modeling Via Differentiable Feedback Delay Networks With Learnable Delay Lines
by: Mezza, Alessandro Ilic, et al.
Published: (2024)

Contextual Speech Extraction: Leveraging Textual History as an Implicit Cue for Target Speech Extraction
by: Kim, Minsu, et al.
Published: (2025)

Non-verbal information in spontaneous speech -- towards a new framework of analysis
by: Biron, Tirza, et al.
Published: (2024)

Rethinking Non-Negative Matrix Factorization with Implicit Neural Representations
by: Subramani, Krishna, et al.
Published: (2024)

Biomimetic Frontend for Differentiable Audio Processing
by: Famularo, Ruolan Leslie, et al.
Published: (2024)

LACTOSE: Linear Array of Conditions, TOpologies with Separated Error-backpropagation -- The Differentiable "IF" Conditional for Differentiable Digital Signal Processing
by: Clarke, Christopher Johann
Published: (2025)

Scoring Time Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription
by: Yan, Yujia, et al.
Published: (2024)