Library Catalog

Bibliographic Details
Main Authors:	Gao, Shibo; Yang, Peipei; Guo, Haiyang; Liu, Yangyang; Chen, Yi; Li, Shuai; Zhu, Han; Xu, Jian; Zhang, Xu-Yao; Huang, Linlin
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2507.21649

Similar Items

VAGU & GtS: LLM-Based Benchmark and Framework for Joint Video Anomaly Grounding and Understanding
by: Gao, Shibo, et al.
Published: (2025)

CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM
by: Xu, Jingwei, et al.
Published: (2024)

No Need For Real Anomaly: MLLM Empowered Zero-Shot Video Anomaly Detection
by: Dai, Zunkai, et al.
Published: (2026)

Survey on AI-Generated Media Detection: From Non-MLLM to MLLM
by: Zou, Yueying, et al.
Published: (2025)

ALARM: Automated MLLM-Based Anomaly Detection in Complex-EnviRonment Monitoring with Uncertainty Quantification
by: Zhang, Congjing, et al.
Published: (2025)

Technical Report for ICML 2024 TiFA Workshop MLLM Attack Challenge: Suffix Injection and Projected Gradient Descent Can Easily Fool An MLLM
by: Guo, Yangyang, et al.
Published: (2024)

Rethinking Metrics and Benchmarks of Video Anomaly Detection
by: Liu, Zihao, et al.
Published: (2025)

BridgeNet: A Unified Multimodal Framework for Bridging 2D and 3D Industrial Anomaly Detection
by: Xiang, An, et al.
Published: (2025)

Absolute-Unified Multi-Class Anomaly Detection via Class-Agnostic Distribution Alignment
by: Guo, Jia, et al.
Published: (2024)

FedMLLM: Federated Fine-tuning MLLM on Multimodal Heterogeneity Data
by: Xu, Binqian, et al.
Published: (2024)

Text-Guided Multimodal Unified Industrial Anomaly Detection
by: Li, Zewen, et al.
Published: (2026)

One Dinomaly2 Detect Them All: A Unified Framework for Full-Spectrum Unsupervised Anomaly Detection
by: Guo, Jia, et al.
Published: (2025)

UniADC: A Unified Framework for Anomaly Detection and Classification
by: Zhang, Ximiao, et al.
Published: (2025)

ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement
by: Huang, Runhui, et al.
Published: (2025)

Place-it-R1: Unlocking Environment-aware Reasoning Potential of MLLM for Video Object Insertion
by: Gu, Bohai, et al.
Published: (2026)

HeadHunt-VAD: Hunting Robust Anomaly-Sensitive Heads in MLLM for Tuning-Free Video Anomaly Detection
by: Cai, Zhaolin, et al.
Published: (2025)

Language-guided Open-world Video Anomaly Detection under Weak Supervision
by: Liu, Zihao, et al.
Published: (2025)

Retrv-R1: A Reasoning-Driven MLLM Framework for Universal and Efficient Multimodal Retrieval
by: Zhu, Lanyun, et al.
Published: (2025)

Omni-Video 2: Scaling MLLM-Conditioned Diffusion for Unified Video Generation and Editing
by: Yang, Hao, et al.
Published: (2026)

GlanceVAD: Exploring Glance Supervision for Label-efficient Video Anomaly Detection
by: Zhang, Huaxin, et al.
Published: (2024)

PatchEAD: Unifying Industrial Visual Prompting Frameworks for Patch-Exclusive Anomaly Detection
by: Huang, Po-Han, et al.
Published: (2025)

A Unified Framework for Human-centric Point Cloud Video Understanding
by: Xu, Yiteng, et al.
Published: (2024)

A Lightweight 3D Anomaly Detection Method with Rotationally Invariant Features
by: Liang, Hanzhe, et al.
Published: (2025)

A Unified Reasoning Framework for Holistic Zero-Shot Video Anomaly Analysis
by: Lin, Dongheng, et al.
Published: (2025)

Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM
by: Zhang, Huaxin, et al.
Published: (2024)

Video Motion Graphs
by: Liu, Haiyang, et al.
Published: (2025)

MLLM-CL: Continual Learning for Multimodal Large Language Models
by: Zhao, Hongbo, et al.
Published: (2025)

Elysium: Exploring Object-level Perception in Videos via MLLM
by: Wang, Han, et al.
Published: (2024)

UI-UG: A Unified MLLM for UI Understanding and Generation
by: Yang, Hao, et al.
Published: (2025)

AnomalyR1: A GRPO-based End-to-end MLLM for Industrial Anomaly Detection
by: Chao, Yuhao, et al.
Published: (2025)

MLLM as Video Narrator: Mitigating Modality Imbalance in Video Moment Retrieval
by: Cai, Weitong, et al.
Published: (2024)

AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection
by: Zhou, Qihang, et al.
Published: (2023)

IADGPT: Unified LVLM for Few-Shot Industrial Anomaly Detection, Localization, and Reasoning via In-Context Learning
by: Zhao, Mengyang, et al.
Published: (2025)

UniShield: An Adaptive Multi-Agent Framework for Unified Forgery Image Detection and Localization
by: Huang, Qing, et al.
Published: (2025)

VRAG-DFD: Verifiable Retrieval-Augmentation for MLLM-based Deepfake Detection
by: Han, Hui, et al.
Published: (2026)

Sparse Reasoning is Enough: Biological-Inspired Framework for Video Anomaly Detection with Large Pre-trained Models
by: Huang, He, et al.
Published: (2025)

TokenCLIP: Token-wise Prompt Learning for Zero-shot Anomaly Detection
by: Zhou, Qihang, et al.
Published: (2025)

Efficient Motion-Aware Video MLLM
by: Zhao, Zijia, et al.
Published: (2025)

BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation
by: Wen, Haiquan, et al.
Published: (2025)

Deep Learning Technology for Face Forgery Detection: A Survey
by: Ma, Lixia, et al.
Published: (2024)