Bibliographic Details
Main Authors: Bhattacharjee, Anindya; Islam, Kaidul; Anan, Kafi; Intesher, Ashir; Fuad, Abrar Assaeem; Saha, Utsab; Imtiaz, Hafiz
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2502.10682
Abstract:
  • The spread of deepfakes poses significant security concerns, demanding reliable detection methods. However, diverse generation techniques and class imbalance in datasets create challenges. We propose CAE-Net, a Convolution- and Attention-based weighted Ensemble network combining spatial and frequency-domain features for effective deepfake detection. The architecture integrates EfficientNet, Data-Efficient Image Transformer (DeiT), and ConvNeXt with wavelet features to learn complementary representations. We evaluated CAE-Net on the diverse IEEE Signal Processing Cup 2025 (DF-Wild Cup) dataset, which has a 5:1 fake-to-real class imbalance. To address this imbalance, we introduce a multistage disjoint-subset training strategy, sequentially training the model on non-overlapping subsets of the fake class while retaining knowledge across stages. Our approach achieved 94.46% accuracy and a 97.60% AUC, outperforming conventional class-balancing methods. Visualizations confirm the network focuses on meaningful facial regions, and our ensemble design demonstrates robustness against adversarial attacks, positioning CAE-Net as a dependable and generalized deepfake detection framework.
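The two mechanisms named in the abstract, disjoint-subset staging and weighted ensembling, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `make_stages` and `weighted_ensemble`, the choice of five stages (picked only to match the 5:1 ratio), the sample counts, and the ensemble weights are all assumptions made for illustration.

```python
import random


def make_stages(real_ids, fake_ids, num_stages, seed=0):
    """Split the fake class into disjoint chunks and pair each with the full real set.

    Hypothetical helper: the number of stages and the shuffling are assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(fake_ids)
    rng.shuffle(shuffled)
    # Stride slicing gives non-overlapping chunks of near-equal size.
    chunks = [shuffled[i::num_stages] for i in range(num_stages)]
    return [(list(real_ids), chunk) for chunk in chunks]


def weighted_ensemble(probs, weights):
    """Weighted average of per-model fake probabilities (weights are assumed)."""
    return sum(p * w for p, w in zip(probs, weights)) / sum(weights)


# Toy data with the 5:1 fake-to-real imbalance mentioned in the abstract.
real = [f"real_{i}" for i in range(100)]
fake = [f"fake_{i}" for i in range(500)]

stages = make_stages(real, fake, num_stages=5)
for k, (real_part, fake_part) in enumerate(stages, start=1):
    # In actual training, the same model would be fine-tuned here, carrying its
    # weights over from the previous stage so that earlier knowledge is retained.
    print(f"stage {k}: {len(real_part)} real, {len(fake_part)} fake")

# Combine three hypothetical model outputs for one image.
score = weighted_ensemble([0.9, 0.6, 0.3], [1.0, 1.0, 2.0])
print(f"ensemble fake probability: {score:.3f}")
```

With these toy sizes each stage sees a balanced 100:100 split, and every fake sample is used in exactly one stage.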