Saved in:
| Main Authors: | Huang, Jiawei, Zhang, Chen, Ren, Yi, Jiang, Ziyue, Ye, Zhenhui, Liu, Jinglin, He, Jinzheng, Yin, Xiang, Zhao, Zhou |
|---|---|
| Format: | Preprint |
| Published: |
2024
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2408.04708 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Similar Items
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis
by: Jiang, Ziyue, et al.
Published: (2023)
by: Jiang, Ziyue, et al.
Published: (2023)
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
by: Zhang, Yu, et al.
Published: (2024)
by: Zhang, Yu, et al.
Published: (2024)
StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching
by: Yao, Jixun, et al.
Published: (2024)
by: Yao, Jixun, et al.
Published: (2024)
Discl-VC: Disentangled Discrete Tokens and In-Context Learning for Controllable Zero-Shot Voice Conversion
by: Wang, Kaidi, et al.
Published: (2025)
by: Wang, Kaidi, et al.
Published: (2025)
MeanVC: Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows
by: Ma, Guobin, et al.
Published: (2025)
by: Ma, Guobin, et al.
Published: (2025)
EAD-VC: Enhancing Speech Auto-Disentanglement for Voice Conversion with IFUB Estimator and Joint Text-Guided Consistent Learning
by: Liang, Ziqi, et al.
Published: (2024)
by: Liang, Ziqi, et al.
Published: (2024)
ReFlow-VC: Zero-shot Voice Conversion Based on Rectified Flow and Speaker Feature Optimization
by: Ren, Pengyu, et al.
Published: (2025)
by: Ren, Pengyu, et al.
Published: (2025)
VC-ENHANCE: Speech Restoration with Integrated Noise Suppression and Voice Conversion
by: Byun, Kyungguen, et al.
Published: (2024)
by: Byun, Kyungguen, et al.
Published: (2024)
SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark
by: Saito, Yuki, et al.
Published: (2024)
by: Saito, Yuki, et al.
Published: (2024)
DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion
by: Ning, Ziqian, et al.
Published: (2023)
by: Ning, Ziqian, et al.
Published: (2023)
PseudoVC: Improving One-shot Voice Conversion with Pseudo Paired Data
by: Cao, Songjun, et al.
Published: (2025)
by: Cao, Songjun, et al.
Published: (2025)
Pureformer-VC: Non-parallel Voice Conversion with Pure Stylized Transformer Blocks and Triplet Discriminative Training
by: Yao, Wenhan, et al.
Published: (2025)
by: Yao, Wenhan, et al.
Published: (2025)
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
by: Chen, Meiying, et al.
Published: (2022)
by: Chen, Meiying, et al.
Published: (2022)
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy
by: Ma, Linhan, et al.
Published: (2024)
by: Ma, Linhan, et al.
Published: (2024)
Takin-VC: Expressive Zero-Shot Voice Conversion via Adaptive Hybrid Content Encoding and Enhanced Timbre Modeling
by: Yang, Yuguang, et al.
Published: (2024)
by: Yang, Yuguang, et al.
Published: (2024)
HybridVC: Efficient Voice Style Conversion with Text and Audio Prompts
by: Niu, Xinlei, et al.
Published: (2024)
by: Niu, Xinlei, et al.
Published: (2024)
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching
by: Zuo, Jialong, et al.
Published: (2025)
by: Zuo, Jialong, et al.
Published: (2025)
Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model
by: Zuo, Jialong, et al.
Published: (2025)
by: Zuo, Jialong, et al.
Published: (2025)
Zero-shot Cross-lingual Voice Transfer for TTS
by: Biadsy, Fadi, et al.
Published: (2024)
by: Biadsy, Fadi, et al.
Published: (2024)
SynthVC: Leveraging Synthetic Data for End-to-End Low Latency Streaming Voice Conversion
by: Guo, Zhao, et al.
Published: (2025)
by: Guo, Zhao, et al.
Published: (2025)
CoDiff-VC: A Codec-Assisted Diffusion Model for Zero-shot Voice Conversion
by: Li, Yuke, et al.
Published: (2024)
by: Li, Yuke, et al.
Published: (2024)
AdaptVC: High Quality Voice Conversion with Adaptive Learning
by: Kim, Jaehun, et al.
Published: (2025)
by: Kim, Jaehun, et al.
Published: (2025)
StreamVC: Real-Time Low-Latency Voice Conversion
by: Yang, Yang, et al.
Published: (2024)
by: Yang, Yang, et al.
Published: (2024)
O_O-VC: Synthetic Data-Driven One-to-One Alignment for Any-to-Any Voice Conversion
by: Tu, Huu Tuong, et al.
Published: (2025)
by: Tu, Huu Tuong, et al.
Published: (2025)
Residual Speaker Representation for One-Shot Voice Conversion
by: Xu, Le, et al.
Published: (2023)
by: Xu, Le, et al.
Published: (2023)
EZ-VC: Easy Zero-shot Any-to-Any Voice Conversion
by: Joglekar, Advait, et al.
Published: (2025)
by: Joglekar, Advait, et al.
Published: (2025)
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations
by: Neekhara, Paarth, et al.
Published: (2023)
by: Neekhara, Paarth, et al.
Published: (2023)
Chain-Talker: Chain Understanding and Rendering for Empathetic Conversational Speech Synthesis
by: Hu, Yifan, et al.
Published: (2025)
by: Hu, Yifan, et al.
Published: (2025)
MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis
by: Jiang, Ziyue, et al.
Published: (2025)
by: Jiang, Ziyue, et al.
Published: (2025)
Pureformer-VC: Non-parallel One-Shot Voice Conversion with Pure Transformer Blocks and Triplet Discriminative Training
by: Yao, Wenhan, et al.
Published: (2024)
by: Yao, Wenhan, et al.
Published: (2024)
QR-VC: Leveraging Quantization Residuals for Linear Disentanglement in Zero-Shot Voice Conversion
by: Sim, Youngjun, et al.
Published: (2024)
by: Sim, Youngjun, et al.
Published: (2024)
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention
by: Li, Junjie, et al.
Published: (2023)
by: Li, Junjie, et al.
Published: (2023)
USM-VC: Mitigating Timbre Leakage with Universal Semantic Mapping Residual Block for Voice Conversion
by: Li, Na, et al.
Published: (2025)
by: Li, Na, et al.
Published: (2025)
Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals
by: Seki, Kentaro, et al.
Published: (2024)
by: Seki, Kentaro, et al.
Published: (2024)
Zero-Shot Sing Voice Conversion: built upon clustering-based phoneme representations
by: Zhou, Wangjin, et al.
Published: (2024)
by: Zhou, Wangjin, et al.
Published: (2024)
Complex-Cycle-Consistent Diffusion Model for Monaural Speech Enhancement
by: Li, Yi, et al.
Published: (2024)
by: Li, Yi, et al.
Published: (2024)
In This Environment, As That Speaker: A Text-Driven Framework for Multi-Attribute Speech Conversion
by: Jin, Jiawei, et al.
Published: (2025)
by: Jin, Jiawei, et al.
Published: (2025)
Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model
by: Du, Zongyang, et al.
Published: (2024)
by: Du, Zongyang, et al.
Published: (2024)
JoyVoice: Long-Context Conditioning for Anthropomorphic Multi-Speaker Conversational Synthesis
by: Yu, Fan, et al.
Published: (2025)
by: Yu, Fan, et al.
Published: (2025)
Cross-lingual Text-To-Speech with Flow-based Voice Conversion for Improved Pronunciation
by: Ellinas, Nikolaos, et al.
Published: (2022)
by: Ellinas, Nikolaos, et al.
Published: (2022)
Similar Items
-
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis
by: Jiang, Ziyue, et al.
Published: (2023) -
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
by: Zhang, Yu, et al.
Published: (2024) -
StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching
by: Yao, Jixun, et al.
Published: (2024) -
Discl-VC: Disentangled Discrete Tokens and In-Context Learning for Controllable Zero-Shot Voice Conversion
by: Wang, Kaidi, et al.
Published: (2025) -
MeanVC: Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows
by: Ma, Guobin, et al.
Published: (2025)