:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Yenan; Kolkman, Guilly; Watanabe, Hiroshi
Format:	Preprint
Published:	2023
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2306.11282

Similar Items

Ambisonics Super-Resolution Using A Waveform-Domain Neural Network
by: Nawfal, Ismael, et al.
Published: (2025)

HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution
by: Zhao, Shengkui, et al.
Published: (2025)

Rage Music Classification and Analysis using K-Nearest Neighbour, Random Forest, Support Vector Machine, Convolutional Neural Networks, and Gradient Boosting
by: Kumar, Akul
Published: (2024)

Barwise Section Boundary Detection in Symbolic Music Using Convolutional Neural Networks
by: Eldeeb, Omar, et al.
Published: (2025)

Convolutional Neural Network Achieves Human-level Accuracy in Music Genre Classification
by: Dong, Mingwen
Published: (2018)

Audio Inpainting in Time-Frequency Domain with Phase-Aware Prior
by: Balušík, Peter, et al.
Published: (2026)

ESPnet-Codec: Comprehensive Training and Evaluation of Neural Codecs for Audio, Music, and Speech
by: Shi, Jiatong, et al.
Published: (2024)

SuperCodec: A Neural Speech Codec with Selective Back-Projection Network
by: Zheng, Youqiang, et al.
Published: (2024)

Musical Metamerism with Time–Frequency Scattering
by: Lostanlen, Vincent, et al.
Published: (2026)

Music Emotion Prediction Using Recurrent Neural Networks
by: Chang, Xinyu, et al.
Published: (2024)

Music Style Transfer with Time-Varying Inversion of Diffusion Models
by: Li, Sifei, et al.
Published: (2024)

The Arrow of Time in Music -- Revisiting the Temporal Structure of Music with Distinguishability and Unique Orientability as the Anchor Point
by: Xu, Qi
Published: (2023)

STSR: High-Fidelity Speech Super-Resolution via Spectral-Transient Context Modeling
by: Yuan, Jiajun, et al.
Published: (2025)

A Two-Stage Band-Split Mamba-2 Network For Music Separation
by: Bai, Jinglin, et al.
Published: (2024)

Local Equivariance Error-Based Metrics for Evaluating Sampling-Frequency-Independent Property of Neural Network
by: Imamura, Kanami, et al.
Published: (2025)

Enhancing Neural Audio Fingerprint Robustness to Audio Degradation for Music Identification
by: Araz, R. Oguz, et al.
Published: (2025)

A Neural Score Follower for Computer Accompaniment of Polyphonic Musical Instruments
by: Pillay, Ashwin
Published: (2025)

Conditioning and Sampling in Variational Diffusion Models for Speech Super-Resolution
by: Yu, Chin-Yun, et al.
Published: (2022)

SMITIN: Self-Monitored Inference-Time INtervention for Generative Music Transformers
by: Koo, Junghyun, et al.
Published: (2024)

InspireMusic: Integrating Super Resolution and Large Language Model for High-Fidelity Long-Form Music Generation
by: Zhang, Chong, et al.
Published: (2025)

SpeechBERTScore: Reference-Aware Automatic Evaluation of Speech Generation Leveraging NLP Evaluation Metrics
by: Saeki, Takaaki, et al.
Published: (2024)

Musical Word Embedding for Music Tagging and Retrieval
by: Doh, SeungHeon, et al.
Published: (2024)

Phase-Retrieval-Based Physics-Informed Neural Networks For Acoustic Magnitude Field Reconstruction
by: Schrader, Karl, et al.
Published: (2026)

Network Modulation Synthesis: New Algorithms for Generating Musical Audio Using Autoencoder Networks
by: Hyrkas, Jeremy
Published: (2021)

Real-Time Pitch/F0 Detection Using Spectrogram Images and Convolutional Neural Networks
by: Zhao, Xufang, et al.
Published: (2025)

MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation
by: Liu, Cheng, et al.
Published: (2025)

Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
by: Bai, Ye, et al.
Published: (2024)

Music De-limiter Networks via Sample-wise Gain Inversion
by: Jeon, Chang-Bin, et al.
Published: (2023)

MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation
by: Zhao, Shengkui, et al.
Published: (2023)

Music2Fail: Transfer Music to Failed Recorder Style
by: Leong, Chon In, et al.
Published: (2024)

PrimeK-Net: Multi-scale Spectral Learning via Group Prime-Kernel Convolutional Neural Networks for Single Channel Speech Enhancement
by: Lin, Zizhen, et al.
Published: (2025)

FakeMusicCaps: a Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models
by: Comanducci, Luca, et al.
Published: (2024)

ITO-Master: Inference-Time Optimization for Audio Effects Modeling of Music Mastering Processors
by: Koo, Junghyun, et al.
Published: (2025)

Learning and composing of classical music using restricted Boltzmann machines
by: Kobayashi, Mutsumi, et al.
Published: (2025)

Building speech corpus with diverse voice characteristics for its prompt-based representation
by: Watanabe, Aya, et al.
Published: (2024)

RaD-Net: A Repairing and Denoising Network for Speech Signal Improvement
by: Liu, Mingshuai, et al.
Published: (2024)

Improving Controllability and Editability for Pretrained Text-to-Music Generation Models
by: Zhang, Yixiao
Published: (2024)

BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec
by: Xin, Detai, et al.
Published: (2024)

Real time fault detection in 3D printers using Convolutional Neural Networks and acoustic signals
by: Waheed, Muhammad Fasih, et al.
Published: (2026)

Low-Complexity Acoustic Scene Classification Using Parallel Attention-Convolution Network
by: Li, Yanxiong, et al.
Published: (2024)