| Main Authors: | Wang, Xuesong, Wang, Caisheng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.08069 |
Similar Items
An Improved Anomaly Detection Model for Automated Inspection of Power Line Insulators
by: Das, Laya, et al.
Published: (2023)
Power-LLaVA: Large Language and Vision Assistant for Power Transmission Line Inspection
by: Wang, Jiahao, et al.
Published: (2024)
Integrating Artificial Intelligence Models and Synthetic Image Data for Enhanced Asset Inspection and Defect Identification
by: Mandati, Reddy, et al.
Published: (2024)
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model
by: Zhang, Wenqi, et al.
Published: (2024)
Seeing the Evidence, Missing the Answer: Tool-Guided Vision-Language Models on Visual Illusions
by: Wang, Xuesong, et al.
Published: (2026)
Argus Inspection: Do Multimodal Large Language Models Possess the Eye of Panoptes?
by: Yao, Yang, et al.
Published: (2025)
Model-Based Real-Time Pose and Sag Estimation of Overhead Power Lines Using LiDAR for Drone Inspection
by: Girard, Alexandre, et al.
Published: (2025)
MM-R1: Unleashing the Power of Unified Multimodal Large Language Models for Personalized Image Generation
by: Liang, Qian, et al.
Published: (2025)
Large Language Models for Multimodal Deformable Image Registration
by: Ma, Mingrui, et al.
Published: (2024)
synth-dacl: Does Synthetic Defect Data Enhance Segmentation Accuracy and Robustness for Real-World Bridge Inspections?
by: Flotzinger, Johannes, et al.
Published: (2025)
DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection
by: Song, Jaewoo, et al.
Published: (2025)
LLMGA: Multimodal Large Language Model based Generation Assistant
by: Xia, Bin, et al.
Published: (2023)
From Prediction to Diagnosis: Reasoning-Aware AI for Photovoltaic Defect Inspection
by: Mistry, Dev, et al.
Published: (2026)
UniPCB: A Generation-Assisted Detection Framework for PCB Defect Inspection
by: Zhang, Huan, et al.
Published: (2026)
An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation
by: Tan, Zhiyu, et al.
Published: (2024)
ForenX: Towards Explainable AI-Generated Image Detection with Multimodal Large Language Models
by: Tan, Chuangchuang, et al.
Published: (2025)
AnySynth: Harnessing the Power of Image Synthetic Data Generation for Generalized Vision-Language Tasks
by: Li, You, et al.
Published: (2024)
Harnessing the Power of Large Vision Language Models for Synthetic Image Detection
by: Keita, Mamadou, et al.
Published: (2024)
Multimodal Large Language Models as Image Classifiers
by: Kisel, Nikita, et al.
Published: (2026)
VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model
by: Yang, Jinze, et al.
Published: (2024)
Image Regeneration: Evaluating Text-to-Image Model via Generating Identical Image with Multimodal Large Language Models
by: Meng, Chutian, et al.
Published: (2024)
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation
by: Wen, Siwei, et al.
Published: (2025)
Kosmos-G: Generating Images in Context with Multimodal Large Language Models
by: Pan, Xichen, et al.
Published: (2023)
An Incremental Unified Framework for Small Defect Inspection
by: Tang, Jiaqi, et al.
Published: (2023)
Interpretable Face Anti-Spoofing: Enhancing Generalization with Multimodal Large Language Models
by: Zhang, Guosheng, et al.
Published: (2025)
Transmission Line Defect Detection Based on UAV Patrol Images and Vision-language Pretraining
by: Zhang, Ke, et al.
Published: (2024)
Fully-Synthetic Training for Visual Quality Inspection in Automotive Production
by: Huber, Christoph, et al.
Published: (2025)
Guiding Instruction-based Image Editing via Multimodal Large Language Models
by: Fu, Tsu-Jui, et al.
Published: (2023)
Safety of Multimodal Large Language Models on Images and Texts
by: Liu, Xin, et al.
Published: (2024)
Enhancing Power Grid Inspections with Machine Learning
by: Lavado, Diogo, et al.
Published: (2025)
MM-UAVBench: How Well Do Multimodal Large Language Models See, Think, and Plan in Low-Altitude UAV Scenarios?
by: Dai, Shiqi, et al.
Published: (2025)
AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models
by: Gao, Yifei, et al.
Published: (2024)
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation
by: Tian, Ye, et al.
Published: (2025)
ThinkFake: Reasoning in Multimodal Large Language Models for AI-Generated Image Detection
by: Huang, Tai-Ming, et al.
Published: (2025)
Genixer: Empowering Multimodal Large Language Models as a Powerful Data Generator
by: Zhao, Henry Hengyuan, et al.
Published: (2023)
ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area
by: Li, Junxian, et al.
Published: (2024)
Grounding Everything in Tokens for Multimodal Large Language Models
by: Ren, Xiangxuan, et al.
Published: (2025)
Mirage: Unveiling Hidden Artifacts in Synthetic Images with Large Vision-Language Models
by: Sharma, Pranav, et al.
Published: (2025)
Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model
by: Li, Mingxing, et al.
Published: (2025)
Histopathology Image Report Generation by Vision Language Model with Multimodal In-Context Learning
by: Liu, Shih-Wen, et al.
Published: (2025)