:: Library Catalog

Bibliographic Details
Main Authors:	Wu, Chia-Hua, Ge, Wanying, Wang, Xin, Yamagishi, Junichi, Tsao, Yu, Wang, Hsin-Min
Format:	Preprint
Published:	2025
Subjects:	Sound
Online Access:	https://arxiv.org/abs/2506.14398

Similar Items

LENS-DF: Deepfake Detection and Temporal Localization for Long-Form Noisy Speech
by: Liu, Xuechen, et al.
Published: (2025)

From Sharpness to Better Generalization for Speech Deepfake Detection
by: Huang, Wen, et al.
Published: (2025)

Towards Data Drift Monitoring for Speech Deepfake Detection in the context of MLOps
by: Wang, Xin, et al.
Published: (2025)

SAVe: Self-Supervised Audio-visual Deepfake Detection Exploiting Visual Artifacts and Audio-visual Misalignment
by: Shahzad, Sahibzada Adil, et al.
Published: (2026)

Post-training for Deepfake Speech Detection
by: Ge, Wanying, et al.
Published: (2025)

Does Fine-tuning by Reinforcement Learning Improve Generalization in Binary Speech Deepfake Detection?
by: Wang, Xin, et al.
Published: (2026)

FakeMark: Deepfake Speech Attribution With Watermarked Artifacts
by: Ge, Wanying, et al.
Published: (2025)

Self Voice Conversion as an Attack against Neural Audio Watermarking
by: Özer, Yigitcan, et al.
Published: (2026)

Spoofing attack augmentation: can differently-trained attack models improve generalisation?
by: Ge, Wanying, et al.
Published: (2023)

The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction
by: Huang, Wen-Chin, et al.
Published: (2024)

A Preliminary Case Study on Long-Form In-the-Wild Audio Spoofing Detection
by: Liu, Xuechen, et al.
Published: (2024)

Zero-Day Audio DeepFake Detection via Retrieval Augmentation and Profile Matching
by: Liu, Xuechen, et al.
Published: (2025)

Understanding Audiovisual Deepfake Detection: Techniques, Challenges, Human Factors and Perceptual Insights
by: Hashmi, Ammarah, et al.
Published: (2024)

ASVspoof 5: Evaluation of Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech
by: Wang, Xin, et al.
Published: (2026)

AVTENet: A Human-Cognition-Inspired Audio-Visual Transformer-Based Ensemble Network for Video Deepfake Detection
by: Hashmi, Ammarah, et al.
Published: (2023)

A Study on Zero-shot Non-intrusive Speech Assessment using Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2024)

QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions
by: Wang, Siyin, et al.
Published: (2025)

A Study on Speech Assessment with Visual Cues
by: Ahmed, Shafique, et al.
Published: (2025)

A Study on Incorporating Whisper for Robust Speech Assessment
by: Zezario, Ryandhimas E., et al.
Published: (2023)

A Study on Zero-Shot Non-Intrusive Speech Intelligibility for Hearing Aids Using Large Language Models
by: Zezario, Ryandhimas E., et al.
Published: (2025)

SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models
by: Yin, Chun, et al.
Published: (2024)

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
by: Gong, Cheng, et al.
Published: (2023)

ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale
by: Wang, Xin, et al.
Published: (2024)

Speech Intelligibility Assessment with Uncertainty-Aware Whisper Embeddings and sLSTM
by: Zezario, Ryandhimas E., et al.
Published: (2025)

AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Deepfake Detection of Frontal Face Videos
by: Shahzad, Sahibzada Adil, et al.
Published: (2023)

Feature Importance across Domains for Improving Non-Intrusive Speech Intelligibility Prediction in Hearing Aids
by: Zezario, Ryandhimas E., et al.
Published: (2025)

Deepfake Word Detection by Next-token Prediction using Fine-tuned Whisper
by: Tran, Hoan My, et al.
Published: (2026)

Non-Intrusive Speech Intelligibility Prediction for Hearing Aids using Whisper and Metadata
by: Zezario, Ryandhimas E., et al.
Published: (2023)

Towards An Integrated Approach for Expressive Piano Performance Synthesis from Music Scores
by: Tang, Jingjing, et al.
Published: (2025)

Spoofing-Aware Speaker Verification Robust Against Domain and Channel Mismatches
by: Zeng, Chang, et al.
Published: (2024)

Comparative Analysis of ASR Methods for Speech Deepfake Detection
by: Salvi, Davide, et al.
Published: (2024)

Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
by: Ren, Wenze, et al.
Published: (2024)

Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems
by: Chen, Zhengyang, et al.
Published: (2024)

Mitigating Language Mismatch in SSL-Based Speaker Anonymization
by: Zhang, Zhe, et al.
Published: (2025)

Improving curriculum learning for target speaker extraction with synthetic speakers
by: Liu, Yun, et al.
Published: (2024)

Training Dynamics-Aware Multi-Factor Curriculum Learning for Target Speaker Extraction
by: Liu, Yun, et al.
Published: (2026)

Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model
by: Zezario, Ryandhimas E., et al.
Published: (2023)

Fake Speech Wild: Detecting Deepfake Speech on Social Media Platform
by: Xie, Yuankun, et al.
Published: (2025)

Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis
by: Wang, Xin, et al.
Published: (2024)

LLM-Guided Reinforcement Learning for Audio-Visual Speech Enhancement
by: Chen, Chih-Ning, et al.
Published: (2026)