| Main Authors: | Gu, Hao, Yi, JiangYan, Wang, Chenglong, Ren, Yong, Tao, Jianhua, Yan, Xinrui, Chen, Yujie, Zhang, Xiaohui |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2408.17009 |
Similar Items
ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild
by: Yi, Jiangyan, et al.
Published: (2024)
Region-Based Optimization in Continual Learning for Audio Deepfake Detection
by: Chen, Yujie, et al.
Published: (2024)
EmoFake: An Initial Dataset for Emotion Fake Audio Detection
by: Zhao, Yan, et al.
Published: (2022)
Reject Threshold Adaptation for Open-Set Model Attribution of Deepfake Audio
by: Yan, Xinrui, et al.
Published: (2024)
ALLM4ADD: Unlocking the Capabilities of Audio Large Language Models for Audio Deepfake Detection
by: Gu, Hao, et al.
Published: (2025)
Audio Deepfake Attribution: An Initial Dataset and Investigation
by: Yan, Xinrui, et al.
Published: (2022)
Distinguishing Neural Speech Synthesis Models Through Fingerprints in Speech Waveforms
by: Zhang, Chu Yuan, et al.
Published: (2023)
RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection
by: Chen, Yujie, et al.
Published: (2024)
Towards Robust Audio Deepfake Detection: A Evolving Benchmark for Continual Learning
by: Zhang, Xiaohui, et al.
Published: (2024)
An Unsupervised Domain Adaptation Method for Locating Manipulated Region in partially fake Audio
by: Zeng, Siding, et al.
Published: (2024)
Residual Speaker Representation for One-Shot Voice Conversion
by: Xu, Le, et al.
Published: (2023)
ADD 2022: the First Audio Deep Synthesis Detection Challenge
by: Yi, Jiangyan, et al.
Published: (2022)
Edit Content, Preserve Acoustics: Imperceptible Text-Based Speech Editing via Self-Consistency Rewards
by: Ren, Yong, et al.
Published: (2026)
SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection
by: Yi, Jiangyan, et al.
Published: (2022)
OV-InstructTTS: Towards Open-Vocabulary Instruct Text-to-Speech
by: Ren, Yong, et al.
Published: (2026)
Spatial Reconstructed Local Attention Res2Net with F0 Subband for Fake Speech Detection
by: Fan, Cunhang, et al.
Published: (2023)
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
by: Tao, Ruijie, et al.
Published: (2024)
ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation
by: Fu, Ruibo, et al.
Published: (2024)
Unified Audio Event Detection
by: Jiang, Yidi, et al.
Published: (2024)
Can Audio Large Language Models Verify Speaker Identity?
by: Ren, Yiming, et al.
Published: (2025)
WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification
by: Zhou, Junzuo, et al.
Published: (2024)
Profile-Error-Tolerant Target-Speaker Voice Activity Detection
by: Wang, Dongmei, et al.
Published: (2023)
Fewer-token Neural Speech Codec with Time-invariant Codes
by: Ren, Yong, et al.
Published: (2023)
Review of MEMS Speakers for Audio Applications
by: Wittek, Nils, et al.
Published: (2025)
Hearing from Silence: Reasoning Audio Descriptions from Silent Videos via Vision-Language Model
by: Ren, Yong, et al.
Published: (2025)
Generalized Fake Audio Detection via Deep Stable Learning
by: Wang, Zhiyong, et al.
Published: (2024)
Two-stage Audio-Visual Target Speaker Extraction System for Real-Time Processing On Edge Device
by: Li, Zixuan, et al.
Published: (2025)
A Noval Feature via Color Quantisation for Fake Audio Detection
by: Wang, Zhiyong, et al.
Published: (2024)
Online Audio-Visual Autoregressive Speaker Extraction
by: Pan, Zexu, et al.
Published: (2025)
Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection
by: Wang, Xiaopeng, et al.
Published: (2024)
TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking
by: Zhou, Junzuo, et al.
Published: (2024)
Descriptor: Extended-Length Audio Dataset for Synthetic Voice Detection and Speaker Recognition (ELAD-SVDSR)
by: Vijaykumar, Rahul, et al.
Published: (2025)
Speech to Speech Synthesis for Voice Impersonation
by: Johnson, Bjorn, et al.
Published: (2026)
RPRA-ADD: Forgery Trace Enhancement-Driven Audio Deepfake Detection
by: Fu, Ruibo, et al.
Published: (2025)
Mixture of Experts Fusion for Fake Audio Detection Using Frozen wav2vec 2.0
by: Wang, Zhiyong, et al.
Published: (2024)
Speaker Distance Estimation in Enclosures from Single-Channel Audio
by: Neri, Michael, et al.
Published: (2024)
Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions
by: Zhao, Jinzheng, et al.
Published: (2023)
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition
by: Li, Guinan, et al.
Published: (2024)
Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection
by: Cai, Pengfei, et al.
Published: (2024)
From Contrast to Commonality: Audio Commonality Captioning for Enhanced Audio-Text Cross-modal Understanding in Multimodal LLMs
by: Jia, Yuhang, et al.
Published: (2025)