Bibliographic Details
Main Authors: Bhattacharjee, Anindya; Islam, Kaidul; Anan, Kafi; Intesher, Ashir; Fuad, Abrar Assaeem; Saha, Utsab; Imtiaz, Hafiz
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2502.10682
author Bhattacharjee, Anindya
Islam, Kaidul
Anan, Kafi
Intesher, Ashir
Fuad, Abrar Assaeem
Saha, Utsab
Imtiaz, Hafiz
contents The spread of deepfakes poses significant security concerns, demanding reliable detection methods. However, diverse generation techniques and class imbalance in datasets create challenges. We propose CAE-Net, a Convolution- and Attention-based weighted Ensemble network combining spatial and frequency-domain features for effective deepfake detection. The architecture integrates EfficientNet, Data-Efficient Image Transformer (DeiT), and ConvNeXt with wavelet features to learn complementary representations. We evaluated CAE-Net on the diverse IEEE Signal Processing Cup 2025 (DF-Wild Cup) dataset, which has a 5:1 fake-to-real class imbalance. To address this, we introduce a multistage disjoint-subset training strategy, sequentially training the model on non-overlapping subsets of the fake class while retaining knowledge across stages. Our approach achieved $94.46\%$ accuracy and a $97.60\%$ AUC, outperforming conventional class-balancing methods. Visualizations confirm the network focuses on meaningful facial regions, and our ensemble design demonstrates robustness against adversarial attacks, positioning CAE-Net as a dependable and generalized deepfake detection framework.
format Preprint
id arxiv_https___arxiv_org_abs_2502_10682
institution arXiv
publishDate 2025
record_format arxiv
title CAE-Net: Generalized Deepfake Image Detection using Convolution and Attention Mechanisms with Spatial and Frequency Domain Features
topic Computer Vision and Pattern Recognition
Machine Learning
Image and Video Processing
url https://arxiv.org/abs/2502.10682
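
The multistage disjoint-subset training strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `disjoint_fake_subsets` and `stage_datasets`, the choice to size each fake subset to match the real class, and the reuse of every real sample in each stage are all assumptions made for illustration. The record only states that the fake class is split into non-overlapping subsets and that the model is trained on them sequentially while retaining knowledge across stages.

```python
import random


def disjoint_fake_subsets(fake_ids, subset_size, seed=0):
    """Shuffle the fake-sample ids and cut them into non-overlapping chunks.

    Assumption: each chunk holds `subset_size` ids (the last may be smaller).
    """
    if subset_size <= 0:
        raise ValueError("subset_size must be positive")
    rng = random.Random(seed)
    shuffled = list(fake_ids)
    rng.shuffle(shuffled)
    return [shuffled[i:i + subset_size]
            for i in range(0, len(shuffled), subset_size)]


def stage_datasets(real_ids, fake_ids, seed=0):
    """Yield one (real, fake_subset) pair per training stage.

    Assumption: every stage reuses all real samples and sees a fresh fake
    subset of the same size, so each stage is roughly class-balanced.
    """
    for subset in disjoint_fake_subsets(fake_ids, len(real_ids), seed):
        yield list(real_ids), subset


# Toy data with the 5:1 fake-to-real ratio mentioned in the abstract.
real = [f"real_{i}" for i in range(4)]
fake = [f"fake_{i}" for i in range(20)]

for stage, (real_part, fake_part) in enumerate(stage_datasets(real, fake), 1):
    # A real pipeline would continue training the same model here, so that
    # weights learned in earlier stages carry over to later ones.
    print(f"stage {stage}: {len(real_part)} real, {len(fake_part)} fake")
```

With a 5:1 imbalance this split gives five stages, each seeing a balanced set, and no fake sample is used in more than one stage. How the paper actually sizes the subsets or carries weights between stages should be checked against the preprint itself.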