Library Catalog

Bibliographic Details
Main Author:	Nakashika, Toru
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Sound; Audio and Speech Processing; Signal Processing
Online Access:	https://arxiv.org/abs/2603.26344

Similar Items

Room Transfer Function Reconstruction Using Complex-valued Neural Networks and Irregularly Distributed Microphones
by: Ronchini, Francesca, et al.
Published: (2024)

BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing
by: Kawamura, Masaya, et al.
Published: (2025)

Bayesian Restoration of Audio Degraded by Low-Frequency Pulses Modeled via Gaussian Process
by: de Carvalho, Hugo Tremonte, et al.
Published: (2020)

A Generalized Bandsplit Neural Network for Cinematic Audio Source Separation
by: Watcharasupat, Karn N., et al.
Published: (2023)

Point Neuron Learning: A New Physics-Informed Neural Network Architecture
by: Bi, Hanwen, et al.
Published: (2024)

GLA-Grad: A Griffin-Lim Extended Waveform Generation Diffusion Model
by: Liu, Haocheng, et al.
Published: (2024)

AI-Assisted Music Production: A User Study on Text-to-Music Models
by: Ronchini, Francesca, et al.
Published: (2025)

A Convolutional Framework for Mapping Imagined Auditory MEG into Listened Brain Responses
by: Maghsoudi, Maryam, et al.
Published: (2025)

Mismatch-Robust Underwater Acoustic Localization Using A Differentiable Modular Forward Model
by: Kari, Dariush, et al.
Published: (2025)

A DNN Based Post-Filter to Enhance the Quality of Coded Speech in MDCT Domain
by: Gupta, Kishan, et al.
Published: (2022)

A Physics-Informed Neural Network-Based Approach for the Spatial Upsampling of Spherical Microphone Arrays
by: Miotello, Federico, et al.
Published: (2024)

SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis
by: Baoueb, Teysir, et al.
Published: (2024)

FlowDec: A flow-based full-band general audio codec with high perceptual quality
by: Welker, Simon, et al.
Published: (2025)

CochCeps-Augment: A Novel Self-Supervised Contrastive Learning Using Cochlear Cepstrum-based Masking for Speech Emotion Recognition
by: Ziogas, Ioannis, et al.
Published: (2024)

Automated Dysphagia Screening Using Noninvasive Neck Acoustic Sensing
by: Chng, Jade, et al.
Published: (2026)

Resampling Filter Design for Multirate Neural Audio Effect Processing
by: Carson, Alistair, et al.
Published: (2025)

Joint Source-Environment Adaptation for Deep Learning-Based Underwater Acoustic Source Ranging
by: Kari, Dariush, et al.
Published: (2025)

Adaptive Control Attention Network for Underwater Acoustic Localization and Domain Adaptation
by: Vo, Quoc Thinh, et al.
Published: (2025)

Resource-Efficient Separation Transformer
by: Della Libera, Luca, et al.
Published: (2022)

Self-Tuning Spectral Clustering for Speaker Diarization
by: Raghav, Nikhil, et al.
Published: (2024)

Reconstruction of Sound Field through Diffusion Models
by: Miotello, Federico, et al.
Published: (2023)

Blind Estimation of Sub-band Acoustic Parameters from Ambisonics Recordings using Spectro-Spatial Covariance Features
by: Meng, Hanyu, et al.
Published: (2024)

Speech Watermarking with Discrete Intermediate Representations
by: Ji, Shengpeng, et al.
Published: (2024)

Diff-TONE: Timestep Optimization for iNstrument Editing in Text-to-Music Diffusion Models
by: Baoueb, Teysir, et al.
Published: (2025)

Listenable Maps for Zero-Shot Audio Classifiers
by: Paissan, Francesco, et al.
Published: (2024)

PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model
by: Hono, Yukiya, et al.
Published: (2024)

EmotionCaps: Enhancing Audio Captioning Through Emotion-Augmented Data Generation
by: Manivannan, Mithun, et al.
Published: (2024)

FunnelNet: An End-to-End Deep Learning Framework to Monitor Digital Heart Murmur in Real-Time
by: Jobayer, Md, et al.
Published: (2024)

Lightweight DNN for Full-Band Speech Denoising on Mobile Devices: Exploiting Long and Short Temporal Patterns
by: Drossos, Konstantinos, et al.
Published: (2025)

The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis
by: Torres, Bernardo, et al.
Published: (2025)

Latent Granular Resynthesis using Neural Audio Codecs
by: Tokui, Nao, et al.
Published: (2025)

XAI-Driven Spectral Analysis of Cough Sounds for Respiratory Disease Characterization
by: Amado-Caballero, Patricia, et al.
Published: (2025)

GLA-Grad++: An Improved Griffin-Lim Guided Diffusion Model for Speech Synthesis
by: Baoueb, Teysir, et al.
Published: (2025)

Learnable Adaptive Time-Frequency Representation via Differentiable Short-Time Fourier Transform
by: Leiber, Maxime, et al.
Published: (2025)

PD-ADSV: An Automated Diagnosing System Using Voice Signals and Hard Voting Ensemble Method for Parkinson's Disease
by: Ghaheri, Paria, et al.
Published: (2023)

Online speaker diarization of meetings guided by speech separation
by: Gruttadauria, Elio, et al.
Published: (2024)

Permutation Invariant Recurrent Neural Networks for Sound Source Tracking Applications
by: Diaz-Guerra, David, et al.
Published: (2023)

Listenable Maps for Audio Classifiers
by: Paissan, Francesco, et al.
Published: (2024)

SLiCK: Exploiting Subsequences for Length-Constrained Keyword Spotting
by: Nishu, Kumari, et al.
Published: (2024)

Improving Machine Hearing on Limited Data Sets
by: Harar, Pavol, et al.
Published: (2019)