| Main Author: | Nakashika, Toru |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.26344 |
Similar Items
Room Transfer Function Reconstruction Using Complex-valued Neural Networks and Irregularly Distributed Microphones
by: Ronchini, Francesca, et al.
Published: (2024)
BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing
by: Kawamura, Masaya, et al.
Published: (2025)
Bayesian Restoration of Audio Degraded by Low-Frequency Pulses Modeled via Gaussian Process
by: de Carvalho, Hugo Tremonte, et al.
Published: (2020)
A Generalized Bandsplit Neural Network for Cinematic Audio Source Separation
by: Watcharasupat, Karn N., et al.
Published: (2023)
Point Neuron Learning: A New Physics-Informed Neural Network Architecture
by: Bi, Hanwen, et al.
Published: (2024)
GLA-Grad: A Griffin-Lim Extended Waveform Generation Diffusion Model
by: Liu, Haocheng, et al.
Published: (2024)
AI-Assisted Music Production: A User Study on Text-to-Music Models
by: Ronchini, Francesca, et al.
Published: (2025)
A Convolutional Framework for Mapping Imagined Auditory MEG into Listened Brain Responses
by: Maghsoudi, Maryam, et al.
Published: (2025)
Mismatch-Robust Underwater Acoustic Localization Using A Differentiable Modular Forward Model
by: Kari, Dariush, et al.
Published: (2025)
A DNN Based Post-Filter to Enhance the Quality of Coded Speech in MDCT Domain
by: Gupta, Kishan, et al.
Published: (2022)
A Physics-Informed Neural Network-Based Approach for the Spatial Upsampling of Spherical Microphone Arrays
by: Miotello, Federico, et al.
Published: (2024)
SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis
by: Baoueb, Teysir, et al.
Published: (2024)
FlowDec: A flow-based full-band general audio codec with high perceptual quality
by: Welker, Simon, et al.
Published: (2025)
CochCeps-Augment: A Novel Self-Supervised Contrastive Learning Using Cochlear Cepstrum-based Masking for Speech Emotion Recognition
by: Ziogas, Ioannis, et al.
Published: (2024)
Automated Dysphagia Screening Using Noninvasive Neck Acoustic Sensing
by: Chng, Jade, et al.
Published: (2026)
Resampling Filter Design for Multirate Neural Audio Effect Processing
by: Carson, Alistair, et al.
Published: (2025)
Joint Source-Environment Adaptation for Deep Learning-Based Underwater Acoustic Source Ranging
by: Kari, Dariush, et al.
Published: (2025)
Adaptive Control Attention Network for Underwater Acoustic Localization and Domain Adaptation
by: Vo, Quoc Thinh, et al.
Published: (2025)
Resource-Efficient Separation Transformer
by: Della Libera, Luca, et al.
Published: (2022)
Self-Tuning Spectral Clustering for Speaker Diarization
by: Raghav, Nikhil, et al.
Published: (2024)
Reconstruction of Sound Field through Diffusion Models
by: Miotello, Federico, et al.
Published: (2023)
Blind Estimation of Sub-band Acoustic Parameters from Ambisonics Recordings using Spectro-Spatial Covariance Features
by: Meng, Hanyu, et al.
Published: (2024)
Speech Watermarking with Discrete Intermediate Representations
by: Ji, Shengpeng, et al.
Published: (2024)
Diff-TONE: Timestep Optimization for iNstrument Editing in Text-to-Music Diffusion Models
by: Baoueb, Teysir, et al.
Published: (2025)
Listenable Maps for Zero-Shot Audio Classifiers
by: Paissan, Francesco, et al.
Published: (2024)
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model
by: Hono, Yukiya, et al.
Published: (2024)
EmotionCaps: Enhancing Audio Captioning Through Emotion-Augmented Data Generation
by: Manivannan, Mithun, et al.
Published: (2024)
FunnelNet: An End-to-End Deep Learning Framework to Monitor Digital Heart Murmur in Real-Time
by: Jobayer, Md, et al.
Published: (2024)
Lightweight DNN for Full-Band Speech Denoising on Mobile Devices: Exploiting Long and Short Temporal Patterns
by: Drossos, Konstantinos, et al.
Published: (2025)
The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis
by: Torres, Bernardo, et al.
Published: (2025)
Latent Granular Resynthesis using Neural Audio Codecs
by: Tokui, Nao, et al.
Published: (2025)
XAI-Driven Spectral Analysis of Cough Sounds for Respiratory Disease Characterization
by: Amado-Caballero, Patricia, et al.
Published: (2025)
GLA-Grad++: An Improved Griffin-Lim Guided Diffusion Model for Speech Synthesis
by: Baoueb, Teysir, et al.
Published: (2025)
Learnable Adaptive Time-Frequency Representation via Differentiable Short-Time Fourier Transform
by: Leiber, Maxime, et al.
Published: (2025)
PD-ADSV: An Automated Diagnosing System Using Voice Signals and Hard Voting Ensemble Method for Parkinson's Disease
by: Ghaheri, Paria, et al.
Published: (2023)
Online speaker diarization of meetings guided by speech separation
by: Gruttadauria, Elio, et al.
Published: (2024)
Permutation Invariant Recurrent Neural Networks for Sound Source Tracking Applications
by: Diaz-Guerra, David, et al.
Published: (2023)
Listenable Maps for Audio Classifiers
by: Paissan, Francesco, et al.
Published: (2024)
SLiCK: Exploiting Subsequences for Length-Constrained Keyword Spotting
by: Nishu, Kumari, et al.
Published: (2024)
Improving Machine Hearing on Limited Data Sets
by: Harar, Pavol, et al.
Published: (2019)