:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Chaomeng; Yu, Zitong; Dong, Junhao; Su, Sen; Shen, Linlin; Xia, Shutao; Cao, Xiaochun
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2505.01851

Similar Items

ForensicZip: More Tokens are Better but Not Necessary in Forensic Vision-Language Models
by: Lai, Yingxin, et al.
Published: (2026)

Agent4FaceForgery: Multi-Agent LLM Framework for Realistic Face Forgery Detection
by: Lai, Yingxin, et al.
Published: (2025)

CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios
by: Ye, Qilang, et al.
Published: (2024)

IAD-GPT: Advancing Visual Knowledge in Multimodal Large Language Model for Industrial Anomaly Detection
by: Li, Zewen, et al.
Published: (2025)

GazeCLIP: Gaze-Guided CLIP with Adaptive-Enhanced Fine-Grained Language Prompt for Deepfake Attribution and Detection
by: Zhang, Yaning, et al.
Published: (2026)

Distilled Transformers with Locally Enhanced Global Representations for Face Forgery Detection
by: Zhang, Yaning, et al.
Published: (2024)

Marginal Debiased Network for Fair Visual Recognition
by: Wang, Mei, et al.
Published: (2024)

HalluCXR: Benchmarking and Mitigating Hallucinations in Medical Vision-Language Models for Chest Radiograph Interpretation
by: Wang, Haoyu, et al.
Published: (2026)

Text-Guided Multimodal Unified Industrial Anomaly Detection
by: Li, Zewen, et al.
Published: (2026)

SFDA-rPPG: Source-Free Domain Adaptive Remote Physiological Measurement with Spatio-Temporal Consistency
by: Xie, Yiping, et al.
Published: (2024)

BIG-MoE: Bypass Isolated Gating MoE for Generalized Multimodal Face Anti-Spoofing
by: Ma, Yingjie, et al.
Published: (2024)

FaceShield: Explainable Face Anti-Spoofing with Multimodal Large Language Models
by: Wang, Hongyang, et al.
Published: (2025)

SHIELD: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models
by: Shi, Yichen, et al.
Published: (2024)

GM-DF: Generalized Multi-Scenario Deepfake Detection
by: Lai, Yingxin, et al.
Published: (2024)

MFCLIP: Multi-modal Fine-grained CLIP for Generalizable Diffusion Face Forgery Detection
by: Zhang, Yaning, et al.
Published: (2024)

PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing
by: Xie, Yiping, et al.
Published: (2025)

Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
by: Jin, Peng, et al.
Published: (2023)

Adversarial Backdoor Defense in CLIP
by: Kuang, Junhao, et al.
Published: (2024)

IKOD: Mitigating Visual Attention Degradation in Large Vision-Language Models
by: Yang, Jiabing, et al.
Published: (2025)

Dynamic Analysis and Adaptive Discriminator for Fake News Detection
by: Su, Xinqi, et al.
Published: (2024)

Denoising and Alignment: Rethinking Domain Generalization for Multimodal Face Anti-Spoofing
by: Ma, Yingjie, et al.
Published: (2025)

GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning
by: Zhang, Yaning, et al.
Published: (2024)

DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing
by: Yang, Jingyi, et al.
Published: (2025)

VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models
by: Liang, Jiawei, et al.
Published: (2024)

Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy
by: Sun, Yinan, et al.
Published: (2025)

Token-Level Entropy Reveals Demographic Disparities in Language Models
by: Lee, Messi H. J.
Published: (2025)

Towards High-resolution 3D Anomaly Detection via Group-Level Feature Contrastive Learning
by: Zhu, Hongze, et al.
Published: (2024)

Learning Representation and Synergy Invariances: A Provable Framework for Generalized Multimodal Face Anti-Spoofing
by: Lin, Xun, et al.
Published: (2025)

Poisoned Forgery Face: Towards Backdoor Attacks on Face Forgery Detection
by: Liang, Jiawei, et al.
Published: (2024)

SGHA-Attack: Semantic-Guided Hierarchical Alignment for Transferable Targeted Attacks on Vision-Language Models
by: Wang, Haobo, et al.
Published: (2026)

Answering Diverse Questions via Text Attached with Key Audio-Visual Clues
by: Ye, Qilang, et al.
Published: (2024)

Fair Diagnosis: Leveraging Causal Modeling to Mitigate Medical Bias
by: Tian, Bowei, et al.
Published: (2024)

FairFedMed: Benchmarking Group Fairness in Federated Medical Imaging with FairLoRA
by: Li, Minghan, et al.
Published: (2025)

Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats
by: Liu, Kuanrong, et al.
Published: (2024)

Enhancing Adversarial Transferability by Balancing Exploration and Exploitation with Gradient-Guided Sampling
by: Niu, Zenghao, et al.
Published: (2025)

FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication
by: Slyman, Eric, et al.
Published: (2024)

StegaVAR: Privacy-Preserving Video Action Recognition via Steganographic Domain Analysis
by: Chen, Lixin, et al.
Published: (2025)

TRRG: Towards Truthful Radiology Report Generation With Cross-modal Disease Clue Enhanced Large Language Model
by: Wang, Yuhao, et al.
Published: (2024)

DecepGPT: Schema-Driven Deception Detection with Multicultural Datasets and Robust Multimodal Learning
by: Huang, Jiajian, et al.
Published: (2026)

PhaseWin Search Framework Enable Efficient Object-Level Interpretation
by: Gu, Zihan, et al.
Published: (2025)