Bibliographic Details
Main authors: Ahmad, Wasim; Peng, Yan-Tsung; Chang, Yuan-Hao; Ganfure, Gaddisa Olani; Khan, Sarwar
Format: Preprint
Published: 2023
Subjects:
Online access: https://arxiv.org/abs/2311.03782
_version_ 1866909646350974976
author Ahmad, Wasim
Peng, Yan-Tsung
Chang, Yuan-Hao
Ganfure, Gaddisa Olani
Khan, Sarwar
author_facet Ahmad, Wasim
Peng, Yan-Tsung
Chang, Yuan-Hao
Ganfure, Gaddisa Olani
Khan, Sarwar
contents Deep-fake videos, generated through AI face-swapping techniques, have gained significant attention due to their potential for impersonation attacks. While most research focuses on real vs. fake detection, attributing a deep-fake to its specific generation model or encoder is vital for forensic analysis, enabling source tracing and tailored countermeasures. Attribution also strengthens detection by exploiting model-specific artifacts, and it supports proactive defenses. We investigate the model attribution problem using two datasets, Deepfakes from Different Models (DFDM) and GANGen-Detection, which comprise deep-fake videos and GAN-generated images, respectively. We use only the fake images from GANGen-Detection, in line with DFDM's focus on attribution rather than binary classification. We formulate the task as a multiclass classification problem and introduce a novel Capsule-Spatial-Temporal (CapST) model that integrates a truncated VGG19 network for feature extraction, capsule networks for hierarchical encoding, and a spatio-temporal attention mechanism. Video-level fusion captures temporal dependencies across frames. Experiments on DFDM and GANGen-Detection show that CapST outperforms baseline models in attribution accuracy while reducing computational cost.
format Preprint
id arxiv_https___arxiv_org_abs_2311_03782
institution arXiv
publishDate 2023
record_format arxiv
spellingShingle CapST: Leveraging Capsule Networks and Temporal Attention for Accurate Model Attribution in Deep-fake Videos
Ahmad, Wasim
Peng, Yan-Tsung
Chang, Yuan-Hao
Ganfure, Gaddisa Olani
Khan, Sarwar
Computer Vision and Pattern Recognition
acmart
title CapST: Leveraging Capsule Networks and Temporal Attention for Accurate Model Attribution in Deep-fake Videos
topic Computer Vision and Pattern Recognition
acmart
url https://arxiv.org/abs/2311.03782