| Main Authors: | Gupta, Ankur, Rai, Anshul, Bansal, Archit, Arora, Vipul |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.02432 |
Similar Items
CodecSep: Prompt-Driven Universal Sound Separation on Neural Audio Codec Latents
by: Banerjee, Adhiraj, et al.
Published: (2025)
Deepfake Detection of Singing Voices With Whisper Encodings
by: Sharma, Falguni, et al.
Published: (2025)
Zero-Shot Duet Singing Voices Separation with Diffusion Models
by: Yu, Chin-Yun, et al.
Published: (2023)
Facing the Music: Tackling Singing Voice Separation in Cinematic Audio Source Separation
by: Watcharasupat, Karn N., et al.
Published: (2024)
SingFake: Singing Voice Deepfake Detection
by: Zang, Yongyi, et al.
Published: (2023)
Perceived Femininity in Singing Voice: Analysis and Prediction
by: Kong, Yuexuan, et al.
Published: (2025)
SingIt! Singer Voice Transformation
by: Eliav, Amit, et al.
Published: (2024)
TokSing: Singing Voice Synthesis based on Discrete Tokens
by: Wu, Yuning, et al.
Published: (2024)
Jointly Recognizing Speech and Singing Voices Based on Multi-Task Audio Source Separation
by: Bai, Ye, et al.
Published: (2024)
Efficient and Fast Generative-Based Singing Voice Separation using a Latent Diffusion Model
by: Plaja-Roglans, Genís, et al.
Published: (2025)
Singing Voice Graph Modeling for SingFake Detection
by: Chen, Xuanjun, et al.
Published: (2024)
Adapting Speech Language Model to Singing Voice Synthesis
by: Zhao, Yiwen, et al.
Published: (2025)
PairAlign: A Framework for Sequence Tokenization via Self-Alignment with Applications to Audio Tokenization
by: Banerjee, Adhiraj, et al.
Published: (2026)
DJCM: A Deep Joint Cascade Model for Singing Voice Separation and Vocal Pitch Estimation
by: Wei, Haojie, et al.
Published: (2024)
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion
by: Sha, Binzhu, et al.
Published: (2023)
SingVERSE: A Diverse, Real-World Benchmark for Singing Voice Enhancement
by: Jiang, Shaohan, et al.
Published: (2025)
SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction
by: Tang, Yuxun, et al.
Published: (2024)
Self-Supervised Singing Voice Pre-Training towards Speech-to-Singing Conversion
by: Li, Ruiqi, et al.
Published: (2024)
InstructSing: High-Fidelity Singing Voice Generation via Instructing Yourself
by: Zeng, Chang, et al.
Published: (2024)
CartoonSing: Unifying Human and Nonhuman Timbres in Singing Generation
by: Han, Jionghao, et al.
Published: (2025)
Interactive singing melody extraction based on active adaptation
by: Saxena, Kavya Ranjan, et al.
Published: (2024)
Robust Singing Voice Transcription Serves Synthesis
by: Li, Ruiqi, et al.
Published: (2024)
SingNet: Towards a Large-Scale, Diverse, and In-the-Wild Singing Voice Dataset
by: Gu, Yicheng, et al.
Published: (2025)
Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference
by: Dai, Shuqi, et al.
Published: (2025)
Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and ACE-KiSing
by: Shi, Jiatong, et al.
Published: (2024)
Robust Training of Singing Voice Synthesis Using Prior and Posterior Uncertainty
by: Zhao, Yiwen, et al.
Published: (2025)
Poly-SVC: Polyphony-Aware Singing Voice Conversion with Harmonic Modeling
by: Geng, Chen, et al.
Published: (2026)
LAPS-Diff: A Diffusion-Based Framework for Singing Voice Synthesis With Language Aware Prosody-Style Guided Learning
by: Dhar, Sandipan, et al.
Published: (2025)
SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion
by: Xue, Liumeng, et al.
Published: (2024)
Attention-Based Audio Embeddings for Query-by-Example
by: Singh, Anup, et al.
Published: (2022)
CONTUNER: Singing Voice Beautifying with Pitch and Expressiveness Condition
by: Wang, Jianzong, et al.
Published: (2024)
UNMIXX: Untangling Highly Correlated Singing Voices Mixtures
by: Jung, Jihoo, et al.
Published: (2026)
Generative Multi-modal Feedback for Singing Voice Synthesis Evaluation
by: Li, Xueyan, et al.
Published: (2025)
Adversarial Multi-Task Learning for Disentangling Timbre and Pitch in Singing Voice Synthesis
by: Kim, Tae-Woo, et al.
Published: (2022)
BiSinger: Bilingual Singing Voice Synthesis
by: Zhou, Huali, et al.
Published: (2023)
Automatic Estimation of Singing Voice Musical Dynamics
by: Narang, Jyoti, et al.
Published: (2024)
TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching
by: Guo, Wenxiang, et al.
Published: (2025)
Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches
by: Pan, Changhao, et al.
Published: (2026)
TeLeS: Temporal Lexeme Similarity Score to Estimate Confidence in End-to-End ASR
by: Ravi, Nagarathna, et al.
Published: (2024)
PerformSinger: Multimodal Singing Voice Synthesis Leveraging Synchronized Lip Cues from Singing Performance Videos
by: Gu, Ke, et al.
Published: (2025)