| Main authors: | Cwitkowitz, Frank, Duan, Zhiyao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online access: | https://arxiv.org/abs/2506.23371 |
Similar documents
Toward Fully Self-Supervised Multi-Pitch Estimation
by: Cwitkowitz, Frank, et al.
Published: (2024)
SynthTab: Leveraging Synthesized Data for Guitar Tablature Transcription
by: Zang, Yongyi, et al.
Published: (2023)
Translation-Equivariant Self-Supervised Learning for Pitch Estimation with Optimal Transport
by: Torres, Bernardo, et al.
Published: (2025)
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
by: Chen, Meiying, et al.
Published: (2022)
Scoring Time Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription
by: Yan, Yujia, et al.
Published: (2024)
Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription
by: Cwitkowitz, Frank, et al.
Published: (2023)
PESTO: Real-Time Pitch Estimation with Self-supervised Transposition-equivariant Objective
by: Riou, Alain, et al.
Published: (2025)
Self-Supervised Embeddings for Detecting Individual Symptoms of Depression
by: Dumpala, Sri Harsha, et al.
Published: (2024)
Adversarial training of Keyword Spotting to Minimize TTS Data Overfitting
by: Park, Hyun Jin, et al.
Published: (2024)
Pseudo-Cepstrum: Pitch Modification for Mel-Based Neural Vocoders
by: Ellinas, Nikolaos, et al.
Published: (2025)
HyperGANStrument: Instrument Sound Synthesis and Editing with Pitch-Invariant Hypernetworks
by: Zhang, Zhe, et al.
Published: (2024)
MT-SLVR: Multi-Task Self-Supervised Learning for Transformation In(Variant) Representations
by: Heggan, Calum, et al.
Published: (2023)
Self-Supervised Speech Quality Estimation and Enhancement Using Only Clean Speech
by: Fu, Szu-Wei, et al.
Published: (2024)
Investigating Confidence Estimation Measures for Speaker Diarization
by: Chowdhury, Anurag, et al.
Published: (2024)
Towards Supervised Performance on Speaker Verification with Self-Supervised Learning by Leveraging Large-Scale ASR Models
by: Miara, Victor, et al.
Published: (2024)
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning
by: Wu, Haibin, et al.
Published: (2021)
Self-Supervised Learning for Speaker Recognition: A study and review
by: Lepage, Theo, et al.
Published: (2026)
Singer Identity Representation Learning using Self-Supervised Techniques
by: Torres, Bernardo, et al.
Published: (2024)
Self-Supervised Learning for Few-Shot Bird Sound Classification
by: Moummad, Ilyass, et al.
Published: (2023)
PESTO: Pitch Estimation with Self-supervised Transposition-equivariant Objective
by: Riou, Alain, et al.
Published: (2023)
Self-Supervised Frameworks for Speaker Verification via Bootstrapped Positive Sampling
by: Lepage, Theo, et al.
Published: (2025)
On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification
by: Heggan, Calum, et al.
Published: (2024)
The Effect of Batch Size on Contrastive Self-Supervised Speech Representation Learning
by: Vaessen, Nik, et al.
Published: (2024)
Label-Efficient Self-Supervised Speaker Verification With Information Maximization and Contrastive Learning
by: Lepage, Théo, et al.
Published: (2022)
Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction
by: Brima, Yusuf, et al.
Published: (2023)
Additive Margin in Contrastive Self-Supervised Frameworks to Learn Discriminative Speaker Representations
by: Lepage, Theo, et al.
Published: (2024)
Improving Perceptual Audio Aesthetic Assessment via Triplet Loss and Self-Supervised Embeddings
by: Wisnu, Dyah A. M. G., et al.
Published: (2025)
Singing Voice Conversion with Accompaniment Using Self-Supervised Representation-Based Melody Features
by: Chen, Wei, et al.
Published: (2025)
Phoneme-Level Deepfake Detection Across Emotional Conditions Using Self-Supervised Embeddings
by: Nallaguntla, Vamshi, et al.
Published: (2026)
Enhancing Audio-Language Models through Self-Supervised Post-Training with Text-Audio Pairs
by: Sinha, Anshuman, et al.
Published: (2024)
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model
by: Hono, Yukiya, et al.
Published: (2024)
A Lightweight Slot-Attention Framework for Multi-Instrument Multi-Pitch Estimation
by: Taenzer, Michael
Published: (2026)
Low-Resource Cross-Domain Singing Voice Synthesis via Reduced Self-Supervised Speech Representations
by: Kakoulidis, Panos, et al.
Published: (2024)
HARP 2.0: Expanding Hosted, Asynchronous, Remote Processing for Deep Learning in the DAW
by: Benetatos, Christodoulos, et al.
Published: (2025)
BabyHuBERT: Multilingual Self-Supervised Learning for Segmenting Speakers in Child-Centered Long-Form Recordings
by: Charlot, Théo, et al.
Published: (2025)
Windowed SummaryMixing: An Efficient Fine-Tuning of Self-Supervised Learning Models for Low-resource Speech Recognition
by: Menon, Aditya Srinivas, et al.
Published: (2026)
Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing
by: Fu, Yonggan, et al.
Published: (2022)
SwiftF0: Fast and Accurate Monophonic Pitch Detection
by: Nieradzik, Lars
Published: (2025)
Which Augmentation Should I Use? An Empirical Investigation of Augmentations for Self-Supervised Phonocardiogram Representation Learning
by: Ballas, Aristotelis, et al.
Published: (2023)
SSPS: Self-Supervised Positive Sampling for Robust Self-Supervised Speaker Verification
by: Lepage, Theo, et al.
Published: (2025)