Table of Contents: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Acharjee, Sanjay, Ratul, Abir Khan, Patino, Diego, Sakib, Md Nazmus
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2511.13970
Tags:	Add Tag No Tags, Be the first to tag this record!

Table of Contents:

Training vision models to detect workplace hazards accurately requires realistic images of unsafe conditions that could lead to accidents. However, acquiring such datasets is difficult because capturing accident-triggering scenarios as they occur is nearly impossible. To overcome this limitation, this study presents a novel scene graph-guided generative AI framework that synthesizes photorealistic images of hazardous scenarios grounded in historical Occupational Safety and Health Administration (OSHA) accident reports. OSHA narratives are analyzed using GPT-4o to extract structured hazard reasoning, which is converted into object-level scene graphs capturing spatial and contextual relationships essential for understanding risk. These graphs guide a text-to-image diffusion model to generate compositionally accurate hazard scenes. To evaluate the realism and semantic fidelity of the generated data, a visual question answering (VQA) framework is introduced. Across four state-of-the-art generative models, the proposed VQA Graph Score outperforms CLIP and BLIP metrics based on entropy-based validation, confirming its higher discriminative sensitivity.

Similar Items