| Main Authors: | Gao, Shibo, Yang, Peipei, Guo, Haiyang, Liu, Yangyang, Chen, Yi, Li, Shuai, Zhu, Han, Xu, Jian, Zhang, Xu-Yao, Huang, Linlin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.21649 |
Similar Items
VAGU & GtS: LLM-Based Benchmark and Framework for Joint Video Anomaly Grounding and Understanding
by: Gao, Shibo, et al.
Published: (2025)
CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM
by: Xu, Jingwei, et al.
Published: (2024)
No Need For Real Anomaly: MLLM Empowered Zero-Shot Video Anomaly Detection
by: Dai, Zunkai, et al.
Published: (2026)
Survey on AI-Generated Media Detection: From Non-MLLM to MLLM
by: Zou, Yueying, et al.
Published: (2025)
ALARM: Automated MLLM-Based Anomaly Detection in Complex-EnviRonment Monitoring with Uncertainty Quantification
by: Zhang, Congjing, et al.
Published: (2025)
Technical Report for ICML 2024 TiFA Workshop MLLM Attack Challenge: Suffix Injection and Projected Gradient Descent Can Easily Fool An MLLM
by: Guo, Yangyang, et al.
Published: (2024)
Rethinking Metrics and Benchmarks of Video Anomaly Detection
by: Liu, Zihao, et al.
Published: (2025)
BridgeNet: A Unified Multimodal Framework for Bridging 2D and 3D Industrial Anomaly Detection
by: Xiang, An, et al.
Published: (2025)
Absolute-Unified Multi-Class Anomaly Detection via Class-Agnostic Distribution Alignment
by: Guo, Jia, et al.
Published: (2024)
FedMLLM: Federated Fine-tuning MLLM on Multimodal Heterogeneity Data
by: Xu, Binqian, et al.
Published: (2024)
Text-Guided Multimodal Unified Industrial Anomaly Detection
by: Li, Zewen, et al.
Published: (2026)
One Dinomaly2 Detect Them All: A Unified Framework for Full-Spectrum Unsupervised Anomaly Detection
by: Guo, Jia, et al.
Published: (2025)
UniADC: A Unified Framework for Anomaly Detection and Classification
by: Zhang, Ximiao, et al.
Published: (2025)
ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement
by: Huang, Runhui, et al.
Published: (2025)
Place-it-R1: Unlocking Environment-aware Reasoning Potential of MLLM for Video Object Insertion
by: Gu, Bohai, et al.
Published: (2026)
HeadHunt-VAD: Hunting Robust Anomaly-Sensitive Heads in MLLM for Tuning-Free Video Anomaly Detection
by: Cai, Zhaolin, et al.
Published: (2025)
Language-guided Open-world Video Anomaly Detection under Weak Supervision
by: Liu, Zihao, et al.
Published: (2025)
Retrv-R1: A Reasoning-Driven MLLM Framework for Universal and Efficient Multimodal Retrieval
by: Zhu, Lanyun, et al.
Published: (2025)
Omni-Video 2: Scaling MLLM-Conditioned Diffusion for Unified Video Generation and Editing
by: Yang, Hao, et al.
Published: (2026)
GlanceVAD: Exploring Glance Supervision for Label-efficient Video Anomaly Detection
by: Zhang, Huaxin, et al.
Published: (2024)
PatchEAD: Unifying Industrial Visual Prompting Frameworks for Patch-Exclusive Anomaly Detection
by: Huang, Po-Han, et al.
Published: (2025)
A Unified Framework for Human-centric Point Cloud Video Understanding
by: Xu, Yiteng, et al.
Published: (2024)
A Lightweight 3D Anomaly Detection Method with Rotationally Invariant Features
by: Liang, Hanzhe, et al.
Published: (2025)
A Unified Reasoning Framework for Holistic Zero-Shot Video Anomaly Analysis
by: Lin, Dongheng, et al.
Published: (2025)
Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM
by: Zhang, Huaxin, et al.
Published: (2024)
Video Motion Graphs
by: Liu, Haiyang, et al.
Published: (2025)
MLLM-CL: Continual Learning for Multimodal Large Language Models
by: Zhao, Hongbo, et al.
Published: (2025)
Elysium: Exploring Object-level Perception in Videos via MLLM
by: Wang, Han, et al.
Published: (2024)
UI-UG: A Unified MLLM for UI Understanding and Generation
by: Yang, Hao, et al.
Published: (2025)
AnomalyR1: A GRPO-based End-to-end MLLM for Industrial Anomaly Detection
by: Chao, Yuhao, et al.
Published: (2025)
MLLM as Video Narrator: Mitigating Modality Imbalance in Video Moment Retrieval
by: Cai, Weitong, et al.
Published: (2024)
AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection
by: Zhou, Qihang, et al.
Published: (2023)
IADGPT: Unified LVLM for Few-Shot Industrial Anomaly Detection, Localization, and Reasoning via In-Context Learning
by: Zhao, Mengyang, et al.
Published: (2025)
UniShield: An Adaptive Multi-Agent Framework for Unified Forgery Image Detection and Localization
by: Huang, Qing, et al.
Published: (2025)
VRAG-DFD: Verifiable Retrieval-Augmentation for MLLM-based Deepfake Detection
by: Han, Hui, et al.
Published: (2026)
Sparse Reasoning is Enough: Biological-Inspired Framework for Video Anomaly Detection with Large Pre-trained Models
by: Huang, He, et al.
Published: (2025)
TokenCLIP: Token-wise Prompt Learning for Zero-shot Anomaly Detection
by: Zhou, Qihang, et al.
Published: (2025)
Efficient Motion-Aware Video MLLM
by: Zhao, Zijia, et al.
Published: (2025)
BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation
by: Wen, Haiquan, et al.
Published: (2025)
Deep Learning Technology for Face Forgery Detection: A Survey
by: Ma, Lixia, et al.
Published: (2024)