| Main Authors: | Chen, Chaomeng, Yu, Zitong, Dong, Junhao, Su, Sen, Shen, Linlin, Xia, Shutao, Cao, Xiaochun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.01851 |
Similar Items
ForensicZip: More Tokens are Better but Not Necessary in Forensic Vision-Language Models
by: Lai, Yingxin, et al.
Published: (2026)
Agent4FaceForgery: Multi-Agent LLM Framework for Realistic Face Forgery Detection
by: Lai, Yingxin, et al.
Published: (2025)
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios
by: Ye, Qilang, et al.
Published: (2024)
IAD-GPT: Advancing Visual Knowledge in Multimodal Large Language Model for Industrial Anomaly Detection
by: Li, Zewen, et al.
Published: (2025)
GazeCLIP: Gaze-Guided CLIP with Adaptive-Enhanced Fine-Grained Language Prompt for Deepfake Attribution and Detection
by: Zhang, Yaning, et al.
Published: (2026)
Distilled Transformers with Locally Enhanced Global Representations for Face Forgery Detection
by: Zhang, Yaning, et al.
Published: (2024)
Marginal Debiased Network for Fair Visual Recognition
by: Wang, Mei, et al.
Published: (2024)
HalluCXR: Benchmarking and Mitigating Hallucinations in Medical Vision-Language Models for Chest Radiograph Interpretation
by: Wang, Haoyu, et al.
Published: (2026)
Text-Guided Multimodal Unified Industrial Anomaly Detection
by: Li, Zewen, et al.
Published: (2026)
SFDA-rPPG: Source-Free Domain Adaptive Remote Physiological Measurement with Spatio-Temporal Consistency
by: Xie, Yiping, et al.
Published: (2024)
BIG-MoE: Bypass Isolated Gating MoE for Generalized Multimodal Face Anti-Spoofing
by: Ma, Yingjie, et al.
Published: (2024)
FaceShield: Explainable Face Anti-Spoofing with Multimodal Large Language Models
by: Wang, Hongyang, et al.
Published: (2025)
SHIELD : An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models
by: Shi, Yichen, et al.
Published: (2024)
GM-DF: Generalized Multi-Scenario Deepfake Detection
by: Lai, Yingxin, et al.
Published: (2024)
MFCLIP: Multi-modal Fine-grained CLIP for Generalizable Diffusion Face Forgery Detection
by: Zhang, Yaning, et al.
Published: (2024)
PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing
by: Xie, Yiping, et al.
Published: (2025)
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
by: Jin, Peng, et al.
Published: (2023)
Adversarial Backdoor Defense in CLIP
by: Kuang, Junhao, et al.
Published: (2024)
IKOD: Mitigating Visual Attention Degradation in Large Vision-Language Models
by: Yang, Jiabing, et al.
Published: (2025)
Dynamic Analysis and Adaptive Discriminator for Fake News Detection
by: Su, Xinqi, et al.
Published: (2024)
Denoising and Alignment: Rethinking Domain Generalization for Multimodal Face Anti-Spoofing
by: Ma, Yingjie, et al.
Published: (2025)
GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning
by: Zhang, Yaning, et al.
Published: (2024)
DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing
by: Yang, Jingyi, et al.
Published: (2025)
VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models
by: Liang, Jiawei, et al.
Published: (2024)
Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy
by: Sun, Yinan, et al.
Published: (2025)
Token-Level Entropy Reveals Demographic Disparities in Language Models
by: Lee, Messi H. J.
Published: (2025)
Towards High-resolution 3D Anomaly Detection via Group-Level Feature Contrastive Learning
by: Zhu, Hongze, et al.
Published: (2024)
Learning Representation and Synergy Invariances: A Provable Framework for Generalized Multimodal Face Anti-Spoofing
by: Lin, Xun, et al.
Published: (2025)
Poisoned Forgery Face: Towards Backdoor Attacks on Face Forgery Detection
by: Liang, Jiawei, et al.
Published: (2024)
SGHA-Attack: Semantic-Guided Hierarchical Alignment for Transferable Targeted Attacks on Vision-Language Models
by: Wang, Haobo, et al.
Published: (2026)
Answering Diverse Questions via Text Attached with Key Audio-Visual Clues
by: Ye, Qilang, et al.
Published: (2024)
Fair Diagnosis: Leveraging Causal Modeling to Mitigate Medical Bias
by: Tian, Bowei, et al.
Published: (2024)
FairFedMed: Benchmarking Group Fairness in Federated Medical Imaging with FairLoRA
by: Li, Minghan, et al.
Published: (2025)
Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats
by: Liu, Kuanrong, et al.
Published: (2024)
Enhancing Adversarial Transferability by Balancing Exploration and Exploitation with Gradient-Guided Sampling
by: Niu, Zenghao, et al.
Published: (2025)
FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication
by: Slyman, Eric, et al.
Published: (2024)
StegaVAR: Privacy-Preserving Video Action Recognition via Steganographic Domain Analysis
by: Chen, Lixin, et al.
Published: (2025)
TRRG: Towards Truthful Radiology Report Generation With Cross-modal Disease Clue Enhanced Large Language Model
by: Wang, Yuhao, et al.
Published: (2024)
DecepGPT: Schema-Driven Deception Detection with Multicultural Datasets and Robust Multimodal Learning
by: Huang, Jiajian, et al.
Published: (2026)
PhaseWin Search Framework Enable Efficient Object-Level Interpretation
by: Gu, Zihan, et al.
Published: (2025)