Library Catalog

Bibliographic Details
Main Authors:	Zhou, Wangjin; Zhang, Fengrun; Liu, Yiming; Guan, Wenhao; Zhao, Yi; Kawahara, Tatsuya
Format:	Preprint
Published:	2024
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2409.08039

Similar Items

Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference
by: Dai, Shuqi, et al.
Published: (2025)

Disentangling Age and Identity with a Mutual Information Minimization Approach for Cross-Age Speaker Verification
by: Zhang, Fengrun, et al.
Published: (2024)

Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion
by: Sha, Binzhu, et al.
Published: (2023)

SONAR: Self-Distilled Continual Pre-training for Domain Adaptive Audio Representation
by: Zhang, Yizhou, et al.
Published: (2025)

Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion
by: Li, Ruiqi, et al.
Published: (2024)

Zero-Shot Duet Singing Voices Separation with Diffusion Models
by: Yu, Chin-Yun, et al.
Published: (2023)

FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion
by: Ferreira, Alef Iury Siqueira, et al.
Published: (2025)

LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance
by: Chen, Shihao, et al.
Published: (2024)

HQ-SVC: Towards High-Quality Zero-Shot Singing Voice Conversion in Low-Resource Scenarios
by: Bai, Bingsong, et al.
Published: (2025)

TokSing: Singing Voice Synthesis based on Discrete Tokens
by: Wu, Yuning, et al.
Published: (2024)

Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion
by: Zhang, Xueyao, et al.
Published: (2023)

ReFlow-VC: Zero-shot Voice Conversion Based on Rectified Flow and Speaker Feature Optimization
by: Ren, Pengyu, et al.
Published: (2025)

TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
by: Zhang, Yu, et al.
Published: (2024)

Discl-VC: Disentangled Discrete Tokens and In-Context Learning for Controllable Zero-Shot Voice Conversion
by: Wang, Kaidi, et al.
Published: (2025)

Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching
by: Zuo, Jialong, et al.
Published: (2025)

Robust Singing Voice Transcription Serves Synthesis
by: Li, Ruiqi, et al.
Published: (2024)

RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion
by: Chen, Wei, et al.
Published: (2024)

StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching
by: Yao, Jixun, et al.
Published: (2024)

An Extensive Analysis of the Singing Voice Conversion Challenge 2025 Evaluation Results
by: Violeta, Lester Phillip, et al.
Published: (2025)

End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
by: Kang, Wonjune, et al.
Published: (2022)

SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis
by: Qian, Jiale, et al.
Published: (2026)

InstructSing: High-Fidelity Singing Voice Generation via Instructing Yourself
by: Zeng, Chang, et al.
Published: (2024)

StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2024)

SingIt! Singer Voice Transformation
by: Eliav, Amit, et al.
Published: (2024)

Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches
by: Pan, Changhao, et al.
Published: (2026)

Residual Speaker Representation for One-Shot Voice Conversion
by: Xu, Le, et al.
Published: (2023)

SYKI-SVC: Advancing Singing Voice Conversion with Post-Processing Innovations and an Open-Source Professional Testset
by: Zhou, Yiquan, et al.
Published: (2025)

TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis
by: Zhang, Yu, et al.
Published: (2025)

ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
by: Chen, Meiying, et al.
Published: (2022)

MeanVC: Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows
by: Ma, Guobin, et al.
Published: (2025)

Disentangling the Prosody and Semantic Information with Pre-trained Model for In-Context Learning based Zero-Shot Voice Conversion
by: Chen, Zhengyang, et al.
Published: (2024)

LLM-based phoneme-to-grapheme for phoneme-based speech recognition
by: Ma, Te, et al.
Published: (2025)

LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation
by: Guan, Wenhao, et al.
Published: (2024)

Singing Voice Graph Modeling for SingFake Detection
by: Chen, Xuanjun, et al.
Published: (2024)

SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction
by: Tang, Yuxun, et al.
Published: (2024)

SingVERSE: A Diverse, Real-World Benchmark for Singing Voice Enhancement
by: Jiang, Shaohan, et al.
Published: (2025)

Improvement Speaker Similarity for Zero-Shot Any-to-Any Voice Conversion of Whispered and Regular Speech
by: Avdeeva, Anastasia, et al.
Published: (2024)

SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion
by: Xue, Liumeng, et al.
Published: (2024)

LHQ-SVC: Lightweight and High Quality Singing Voice Conversion Modeling
by: Huang, Yubo, et al.
Published: (2024)

Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and ACE-KiSing
by: Shi, Jiatong, et al.
Published: (2024)