Library Catalog

Bibliographic Details
Main Authors:	Li, Jingwei; Tong, Jiaxin; Wu, Pengfei
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.15903

Similar Items

Explainable Deepfake Detection with RL Enhanced Self-Blended Images
by: Jiang, Ning, et al.
Published: (2026)

FSBI: Deepfakes Detection with Frequency Enhanced Self-Blended Images
by: Hasanaath, Ahmed Abul, et al.
Published: (2024)

Extending CLIP's Image-Text Alignment to Referring Image Segmentation
by: Kim, Seoyeon, et al.
Published: (2023)

VTD-CLIP: Video-to-Text Discretization via Prompting CLIP
by: Zhu, Wencheng, et al.
Published: (2025)

The Alpha Blending Hypothesis: Compositing Shortcut in Deepfake Detection
by: Yermakov, Andrii, et al.
Published: (2026)

Unlocking the Hidden Potential of CLIP in Generalizable Deepfake Detection
by: Yermakov, Andrii, et al.
Published: (2025)

C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection
by: Tan, Chuangchuang, et al.
Published: (2024)

Capture Artifacts via Progressive Disentangling and Purifying Blended Identities for Deepfake Detection
by: Zhou, Weijie, et al.
Published: (2024)

Enhancing Multimodal Understanding with CLIP-Based Image-to-Text Transformation
by: Che, Chang, et al.
Published: (2024)

GazeCLIP: Gaze-Guided CLIP with Adaptive-Enhanced Fine-Grained Language Prompt for Deepfake Attribution and Detection
by: Zhang, Yaning, et al.
Published: (2026)

Contrast-Aware Calibration for Fine-Tuned CLIP: Leveraging Image-Text Alignment
by: Lv, Song-Lin, et al.
Published: (2025)

MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment
by: Das, Anurag, et al.
Published: (2024)

TTD: Text-Tag Self-Distillation Enhancing Image-Text Alignment in CLIP to Alleviate Single Tag Bias
by: Jo, Sanghyun, et al.
Published: (2024)

MARBLE: Material Recomposition and Blending in CLIP-Space
by: Cheng, Ta-Ying, et al.
Published: (2025)

Generalizing Deepfake Video Detection with Plug-and-Play: Video-Level Blending and Spatiotemporal Adapter Tuning
by: Yan, Zhiyuan, et al.
Published: (2024)

LLMs Are Not Yet Ready for Deepfake Image Detection
by: Tariq, Shahroz, et al.
Published: (2025)

Data-Driven Deepfake Image Detection Method -- The 2024 Global Deepfake Image Detection Challenge
by: Zhu, Xiaoya, et al.
Published: (2025)

IPAD-CLIP: Teaching CLIP to Detect Image Local Perceptual Artifacts
by: Wang, Juan, et al.
Published: (2026)

Blending Concepts with Text-to-Image Diffusion Models
by: Olearo, Lorenzo, et al.
Published: (2025)

Toward Medical Deepfake Detection: A Comprehensive Dataset and Novel Method
by: Li, Shuaibo, et al.
Published: (2025)

CLIP-IT: CLIP-based Pairing for Histology Images Classification
by: Karimian, Banafsheh, et al.
Published: (2025)

Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval
by: Huang, Hailang, et al.
Published: (2024)

Text-to-Image Generation Via Energy-Based CLIP
by: Ganz, Roy, et al.
Published: (2024)

GC-ConsFlow: Leveraging Optical Flow Residuals and Global Context for Robust Deepfake Detection
by: Chen, Jiaxin, et al.
Published: (2025)

Pre-training CLIP against Data Poisoning with Optimal Transport-based Matching and Alignment
by: Zhang, Tong, et al.
Published: (2025)

CLIP3D-AD: Extending CLIP for 3D Few-Shot Anomaly Detection with Multi-View Images Generation
by: Zuo, Zuo, et al.
Published: (2024)

MediCLIP: Adapting CLIP for Few-shot Medical Image Anomaly Detection
by: Zhang, Ximiao, et al.
Published: (2024)

β-CLIP: Text-Conditioned Contrastive Learning for Multi-Granular Vision-Language Alignment
by: Zohra, Fatimah, et al.
Published: (2025)

EntityCLIP: Entity-Centric Image-Text Matching via Multimodal Attentive Contrastive Learning
by: Wang, Yaxiong, et al.
Published: (2024)

Language-Image Alignment with Fixed Text Encoders
by: Yang, Jingfeng, et al.
Published: (2025)

Long-CLIP: Unlocking the Long-Text Capability of CLIP
by: Zhang, Beichen, et al.
Published: (2024)

IBURD: Image Blending for Underwater Robotic Detection
by: Hong, Jungseok, et al.
Published: (2025)

Evidence Packing for Cross-Domain Image Deepfake Detection with LVLMs
by: Liu, Yuxin, et al.
Published: (2026)

LAIP: Learning Local Alignment from Image-Phrase Modeling for Text-based Person Search
by: Wang, Haiguang, et al.
Published: (2024)

Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection
by: Zhang, Zhaoxiang, et al.
Published: (2024)

Text and Image Are Mutually Beneficial: Enhancing Training-Free Few-Shot Classification with CLIP
by: Li, Yayuan, et al.
Published: (2024)

CalibCLIP: Contextual Calibration of Dominant Semantics for Text-Driven Image Retrieval
by: Kang, Bin, et al.
Published: (2025)

La-SoftMoE CLIP for Unified Physical-Digital Face Attack Detection
by: Zou, Hang, et al.
Published: (2024)

CPN: Complementary Proposal Network for Unconstrained Text Detection
by: Wu, Longhuang, et al.
Published: (2024)

Free Lunch Alignment of Text-to-Image Diffusion Models without Preference Image Pairs
by: Xian, Jia Jun Cheng, et al.
Published: (2025)