| Main Authors: | Wang, Yuxiang, Zhang, You, Duan, Zhiyao, Bocko, Mark |
|---|---|
| Format: | Preprint |
| Published: | 2022 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2207.14352 |
Similar Items
A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification
by: Zhang, You, et al.
Published: (2022)
Towards Perception-Informed Latent HRTF Representations
by: Zhang, You, et al.
Published: (2025)
PartialEdit: Identifying Partial Deepfakes in the Era of Neural Speech Editing
by: Zhang, You, et al.
Published: (2025)
UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021
by: Chen, Xinhui, et al.
Published: (2021)
An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems
by: Zhang, You, et al.
Published: (2021)
A Multi-Stream Fusion Approach with One-Class Learning for Audio-Visual Deepfake Detection
by: Lee, Kyungbok, et al.
Published: (2024)
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed
by: Chen, Meiying, et al.
Published: (2022)
Generating Novel and Realistic Speakers for Voice Conversion
by: Chen, Meiying Melissa, et al.
Published: (2025)
A Data-Driven Exploration of Elevation Cues in HRTFs: An Explainable AI Perspective Across Multiple Datasets
by: De Rus, Juan Antonio, et al.
Published: (2025)
SingFake: Singing Voice Deepfake Detection
by: Zang, Yongyi, et al.
Published: (2023)
Cacophony: An Improved Contrastive Audio-Text Model
by: Zhu, Ge, et al.
Published: (2024)
Audio Generation Through Score-Based Generative Modeling: Design Principles and Implementation
by: Zhu, Ge, et al.
Published: (2025)
SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge
by: Zhang, You, et al.
Published: (2024)
Scoring Time Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription
by: Yan, Yujia, et al.
Published: (2024)
Toward Fully Self-Supervised Multi-Pitch Estimation
by: Cwitkowitz, Frank, et al.
Published: (2024)
Investigating an Overfitting and Degeneration Phenomenon in Self-Supervised Multi-Pitch Estimation
by: Cwitkowitz, Frank, et al.
Published: (2025)
EchoScan: Scanning Complex Room Geometries via Acoustic Echoes
by: Yeon, Inmo, et al.
Published: (2023)
Conan: A Chunkwise Online Network for Zero-Shot Adaptive Voice Conversion
by: Zhang, Yu, et al.
Published: (2025)
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
by: Wang, Shuai, et al.
Published: (2024)
Compositional Audio Representation Learning
by: Sridhar, Sripathi, et al.
Published: (2024)
Head-Related Transfer Function Individualization Using Anthropometric Features and Spatially Independent Latent Representation
by: Niu, Ryan, et al.
Published: (2025)
MusicHiFi: Fast High-Fidelity Stereo Vocoding
by: Zhu, Ge, et al.
Published: (2024)
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition
by: Cornell, Samuele, et al.
Published: (2024)
CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection
by: Zang, Yongyi, et al.
Published: (2024)
Sound Field Reconstruction Using a Compact Acoustics-informed Neural Network
by: Ma, Fei, et al.
Published: (2024)
Head Orientation Estimation with Distributed Microphones Using Speech Radiation Patterns
by: Müller, Kaspar, et al.
Published: (2023)
Learning Disentangled Speech Representations with Contrastive Learning and Time-Invariant Retrieval
by: Deng, Yimin, et al.
Published: (2024)
Improving Short Utterance Anti-Spoofing with AASIST2
by: Zhang, Yuxiang, et al.
Published: (2023)
Selective-Memory Meta-Learning with Environment Representations for Sound Event Localization and Detection
by: Hu, Jinbo, et al.
Published: (2023)
Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech
by: Zhou, Enting, et al.
Published: (2023)
Deep Speech Synthesis from Multimodal Articulatory Representations
by: Wu, Peter, et al.
Published: (2024)
Self-Distillation Prototypes Network: Learning Robust Speaker Representations without Supervision
by: Chen, Yafeng, et al.
Published: (2024)
Self-Distillation Prototypes Network: Learning Robust Speaker Representations without Supervision
by: Chen, Yafeng, et al.
Published: (2023)
SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan
by: Zhang, You, et al.
Published: (2024)
Learning Expressive Disentangled Speech Representations with Soft Speech Units and Adversarial Style Augmentation
by: Deng, Yimin, et al.
Published: (2024)
Adaptive Speech Emotion Representation Learning Based On Dynamic Graph
by: Gao, Yingxue, et al.
Published: (2024)
Progressive Residual Extraction based Pre-training for Speech Representation Learning
by: Wang, Tianrui, et al.
Published: (2024)
Feasibility of Mental Health Triage Call Priority Prediction Using Machine Learning
by: Rana, Rajib, et al.
Published: (2024)
ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation
by: Fu, Ruibo, et al.
Published: (2024)
3D Room Geometry Inference from Multichannel Room Impulse Response using Deep Neural Network
by: Yeon, Inmo, et al.
Published: (2024)