Proactive safety hazard identification using visual–text semantic similarity for construction safety management
Document Type
Article
Publication Date
10-2024
Department
Department of Civil, Environmental, and Geospatial Engineering
Abstract
Automated safety management in construction can reduce injuries by identifying hazardous postures, actions, and missing personal protective equipment (PPE). However, existing computer vision (CV) methods struggle to connect recognition results to text-based safety rules. To address this issue, this paper presents a multi-modal framework that bridges the gap between construction image monitoring and safety knowledge. The framework includes an image processing module that utilizes CV and dense image captioning techniques, and a text processing module that employs natural language processing for semantic similarity evaluation. Experiments showed a mean average precision of 49.6% in dense captioning and an F1 score of 74.3% in hazard identification. While the proposed framework demonstrates a promising multi-modal approach toward automated safety hazard identification and reasoning, improvements in dataset size and model performance are still needed to enhance its effectiveness in real-world applications.
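The core idea of the text processing module, matching a generated image caption against text-based safety rules by semantic similarity, can be illustrated with a minimal sketch. This example uses a simple bag-of-words cosine similarity in place of the learned sentence embeddings the paper's framework would use, and the caption and rule strings are invented for illustration:

```python
from collections import Counter
import math

def cosine_sim(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical dense caption from the image processing module
caption = "a worker without a hard hat standing near the scaffolding"

# Hypothetical text-based safety rules
rules = [
    "workers must wear a hard hat on site",
    "ladders must be inspected before use",
]

# Rank rules by similarity to the caption; the top-scoring rule
# suggests which requirement the observed scene may violate
scores = {rule: cosine_sim(caption, rule) for rule in rules}
best_rule = max(scores, key=scores.get)
```

In the actual framework, a sentence-embedding model would replace the bag-of-words vectors so that paraphrases (e.g., "helmet" vs. "hard hat") still score as similar, but the retrieval-by-similarity structure is the same.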
Publication Title
Automation in Construction
Recommended Citation
Wang, Y., Xiao, B., Bouferguene, A., & Al-Hussein, M. (2024). Proactive safety hazard identification using visual–text semantic similarity for construction safety management. Automation in Construction, 166. http://doi.org/10.1016/j.autcon.2024.105602
Retrieved from: https://digitalcommons.mtu.edu/michigantech-p2/906