Saved in:
| Main Authors: | Zhou, Wangjin, Zhang, Fengrun, Liu, Yiming, Guan, Wenhao, Zhao, Yi, Kawahara, Tatsuya |
|---|---|
| Format: | Preprint |
| Published: |
2024
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.08039 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Similar Items
Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference
by: Dai, Shuqi, et al.
Published: (2025)
by: Dai, Shuqi, et al.
Published: (2025)
Disentangling Age and Identity with a Mutual Information Minimization Approach for Cross-Age Speaker Verification
by: Zhang, Fengrun, et al.
Published: (2024)
by: Zhang, Fengrun, et al.
Published: (2024)
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion
by: Sha, Binzhu, et al.
Published: (2023)
by: Sha, Binzhu, et al.
Published: (2023)
SONAR: Self-Distilled Continual Pre-training for Domain Adaptive Audio Representation
by: Zhang, Yizhou, et al.
Published: (2025)
by: Zhang, Yizhou, et al.
Published: (2025)
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion
by: Li, Ruiqi, et al.
Published: (2024)
by: Li, Ruiqi, et al.
Published: (2024)
Zero-Shot Duet Singing Voices Separation with Diffusion Models
by: Yu, Chin-Yun, et al.
Published: (2023)
by: Yu, Chin-Yun, et al.
Published: (2023)
FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion
by: Ferreira, Alef Iury Siqueira, et al.
Published: (2025)
by: Ferreira, Alef Iury Siqueira, et al.
Published: (2025)
LDM-SVC: Latent Diffusion Model Based Zero-Shot Any-to-Any Singing Voice Conversion with Singer Guidance
by: Chen, Shihao, et al.
Published: (2024)
by: Chen, Shihao, et al.
Published: (2024)
HQ-SVC: Towards High-Quality Zero-Shot Singing Voice Conversion in Low-Resource Scenarios
by: Bai, Bingsong, et al.
Published: (2025)
by: Bai, Bingsong, et al.
Published: (2025)
TokSing: Singing Voice Synthesis based on Discrete Tokens
by: Wu, Yuning, et al.
Published: (2024)
by: Wu, Yuning, et al.
Published: (2024)
Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion
by: Zhang, Xueyao, et al.
Published: (2023)
by: Zhang, Xueyao, et al.
Published: (2023)
ReFlow-VC: Zero-shot Voice Conversion Based on Rectified Flow and Speaker Feature Optimization
by: Ren, Pengyu, et al.
Published: (2025)
by: Ren, Pengyu, et al.
Published: (2025)
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
by: Zhang, Yu, et al.
Published: (2024)
by: Zhang, Yu, et al.
Published: (2024)
Discl-VC: Disentangled Discrete Tokens and In-Context Learning for Controllable Zero-Shot Voice Conversion
by: Wang, Kaidi, et al.
Published: (2025)
by: Wang, Kaidi, et al.
Published: (2025)
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching
by: Zuo, Jialong, et al.
Published: (2025)
by: Zuo, Jialong, et al.
Published: (2025)
Robust Singing Voice Transcription Serves Synthesis
by: Li, Ruiqi, et al.
Published: (2024)
by: Li, Ruiqi, et al.
Published: (2024)
RobustSVC: HuBERT-based Melody Extractor and Adversarial Learning for Robust Singing Voice Conversion
by: Chen, Wei, et al.
Published: (2024)
by: Chen, Wei, et al.
Published: (2024)
StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching
by: Yao, Jixun, et al.
Published: (2024)
by: Yao, Jixun, et al.
Published: (2024)
An Extensive Analysis of the Singing Voice Conversion Challenge 2025 Evaluation Results
by: Violeta, Lester Phillip, et al.
Published: (2025)
by: Violeta, Lester Phillip, et al.
Published: (2025)
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions
by: Kang, Wonjune, et al.
Published: (2022)
by: Kang, Wonjune, et al.
Published: (2022)
SoulX-Singer: Towards High-Quality Zero-Shot Singing Voice Synthesis
by: Qian, Jiale, et al.
Published: (2026)
by: Qian, Jiale, et al.
Published: (2026)
InstructSing: High-Fidelity Singing Voice Generation via Instructing Yourself
by: Zeng, Chang, et al.
Published: (2024)
by: Zeng, Chang, et al.
Published: (2024)
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2024)
by: Wang, Zhichao, et al.
Published: (2024)
SingIt! Singer Voice Transformation
by: Eliav, Amit, et al.
Published: (2024)
by: Eliav, Amit, et al.
Published: (2024)
Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches
by: Pan, Changhao, et al.
Published: (2026)
by: Pan, Changhao, et al.
Published: (2026)
Residual Speaker Representation for One-Shot Voice Conversion
by: Xu, Le, et al.
Published: (2023)
by: Xu, Le, et al.
Published: (2023)
SYKI-SVC: Advancing Singing Voice Conversion with Post-Processing Innovations and an Open-Source Professional Testset
by: Zhou, Yiquan, et al.
Published: (2025)
by: Zhou, Yiquan, et al.
Published: (2025)
TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis
by: Zhang, Yu, et al.
Published: (2025)
by: Zhang, Yu, et al.
Published: (2025)
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
by: Chen, Meiying, et al.
Published: (2022)
by: Chen, Meiying, et al.
Published: (2022)
MeanVC: Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows
by: Ma, Guobin, et al.
Published: (2025)
by: Ma, Guobin, et al.
Published: (2025)
Disentangling the Prosody and Semantic Information with Pre-trained Model for In-Context Learning based Zero-Shot Voice Conversion
by: Chen, Zhengyang, et al.
Published: (2024)
by: Chen, Zhengyang, et al.
Published: (2024)
LLM-based phoneme-to-grapheme for phoneme-based speech recognition
by: Ma, Te, et al.
Published: (2025)
by: Ma, Te, et al.
Published: (2025)
LAFMA: A Latent Flow Matching Model for Text-to-Audio Generation
by: Guan, Wenhao, et al.
Published: (2024)
by: Guan, Wenhao, et al.
Published: (2024)
Singing Voice Graph Modeling for SingFake Detection
by: Chen, Xuanjun, et al.
Published: (2024)
by: Chen, Xuanjun, et al.
Published: (2024)
SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction
by: Tang, Yuxun, et al.
Published: (2024)
by: Tang, Yuxun, et al.
Published: (2024)
SingVERSE: A Diverse, Real-World Benchmark for Singing Voice Enhancement
by: Jiang, Shaohan, et al.
Published: (2025)
by: Jiang, Shaohan, et al.
Published: (2025)
Improvement Speaker Similarity for Zero-Shot Any-to-Any Voice Conversion of Whispered and Regular Speech
by: Avdeeva, Anastasia, et al.
Published: (2024)
by: Avdeeva, Anastasia, et al.
Published: (2024)
SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion
by: Xue, Liumeng, et al.
Published: (2024)
by: Xue, Liumeng, et al.
Published: (2024)
LHQ-SVC: Lightweight and High Quality Singing Voice Conversion Modeling
by: Huang, Yubo, et al.
Published: (2024)
by: Huang, Yubo, et al.
Published: (2024)
Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and ACE-KiSing
by: Shi, Jiatong, et al.
Published: (2024)
by: Shi, Jiatong, et al.
Published: (2024)
Similar Items
-
Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference
by: Dai, Shuqi, et al.
Published: (2025) -
Disentangling Age and Identity with a Mutual Information Minimization Approach for Cross-Age Speaker Verification
by: Zhang, Fengrun, et al.
Published: (2024) -
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion
by: Sha, Binzhu, et al.
Published: (2023) -
SONAR: Self-Distilled Continual Pre-training for Domain Adaptive Audio Representation
by: Zhang, Yizhou, et al.
Published: (2025) -
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion
by: Li, Ruiqi, et al.
Published: (2024)