Staff View: VoD: Learning Volume of Differences for Video-Based Deepfake Detection :: Library Catalog

Bibliographic Details
Main Authors:	Xu, Ying; Pedersen, Marius; Raja, Kiran
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2503.07607

_version_	1866915189906997248
author	Xu, Ying; Pedersen, Marius; Raja, Kiran
author_facet	Xu, Ying; Pedersen, Marius; Raja, Kiran
contents	The rapid development of deep learning and generative AI technologies has profoundly transformed the digital content landscape, creating realistic Deepfakes that pose substantial challenges to public trust and digital media integrity. This paper introduces a novel Deepfake detection framework, Volume of Differences (VoD), designed to enhance detection accuracy by exploiting temporal and spatial inconsistencies between consecutive video frames. VoD employs a progressive learning approach that captures differences across multiple axes through the use of consecutive frame differences (CFD) and a network with stepwise expansions. We evaluate our approach with intra-dataset and cross-dataset testing scenarios on various well-known Deepfake datasets. Our findings demonstrate that VoD excels with the data it has been trained on and shows strong adaptability to novel, unseen data. Additionally, comprehensive ablation studies examine various configurations of segment length, sampling steps, and intervals, offering valuable insights for optimizing the framework. The code for our VoD framework is available at https://github.com/xuyingzhongguo/VoD.
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_07607
institution	arXiv
publishDate	2025
record_format	arxiv
title	VoD: Learning Volume of Differences for Video-Based Deepfake Detection
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2503.07607

Similar Items