Accidents analysis of mining industry through semantic text representation and dimensionality reduction: An integrated clustering framework

Document Type

Article

Publication Date

12-1-2025

Department

Department of Geological and Mining Engineering and Sciences

Abstract

Despite significant improvements in worker health and safety in recent decades, the mining industry still experiences fatal and non-fatal accidents. This underscores the critical need for a more nuanced understanding of accident causation patterns through accident data analysis. While conventional analytical approaches have yielded valuable insights, the extensive information embedded within text-based accident narratives remains underutilized. To address this gap, this study presents an artificial intelligence (AI)-based framework that integrates transformer-based natural language processing, nonlinear dimensionality reduction, and unsupervised machine learning to analyze and cluster accident narratives from the U.S. mining industry. Specifically, this study uses Sentence-BERT (SBERT), a sentence embedding model based on Bidirectional Encoder Representations from Transformers (BERT), to extract high-dimensional semantic representations of accident narratives. These embeddings are then mapped to a low-dimensional space using the Uniform Manifold Approximation and Projection (UMAP) technique, followed by clustering with the k-means algorithm and subsequent hazard-focused cluster analysis. The primary contribution to AI lies in demonstrating the effectiveness of combining modern sentence embedding techniques with dimensionality reduction and clustering for the semantic analysis of safety-related narratives. From an engineering standpoint, this framework enables the identification of latent accident patterns that can inform hazard detection and guide safety interventions in the mining industry. The resulting clusters reveal diverse accident patterns across mining operations.
In clusters associated with underground mining, a high proportion of incidents (ranging from 84 to 98 %) involved no equipment, with distinct injury patterns: torso injuries (67 %) from over-exertion, lower extremity injuries (58 %) from slips/falls, and upper extremity injuries from over-exertion (95 %) and material handling (85 %). Equipment-related clusters revealed strong associations with drilling tools (92 %), loaders (98 %), and bolting equipment (96 %). Clusters associated with strip/quarry/open pit operations exhibited a high frequency of vehicle-related accidents (98 % transportation, 99 % loaders), often resulting in multiple body part injuries. Milling operation clusters showed 52–97 % no-equipment accidents, with injury patterns similar to those in underground mining. Additionally, noise-induced hearing loss (96–97 %) was observed in clusters spanning all mining operation types. These findings offer actionable insights for safety professionals and support data-driven, targeted interventions in the mining industry.
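
As an illustration of the last two stages described above (k-means clustering followed by hazard-focused cluster analysis), the following is a minimal sketch and not the authors' implementation. It assumes the narratives have already been embedded and projected to a low-dimensional space; the SBERT and UMAP steps are not reproduced here. The 2-D points, the equipment labels, the function names `kmeans` and `category_share`, and the choice to initialize centroids from the first k points are all hypothetical, chosen only to show how a per-cluster percentage such as "no equipment involved" could be computed.

```python
from collections import Counter, defaultdict
from math import dist


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means. Initial centroids are the first k points,
    a toy choice made here for reproducibility (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels


def category_share(labels, categories):
    """Percentage of each category value within each cluster."""
    counts = defaultdict(Counter)
    for lab, cat in zip(labels, categories):
        counts[lab][cat] += 1
    return {
        lab: {cat: 100.0 * n / sum(ctr.values()) for cat, n in ctr.items()}
        for lab, ctr in counts.items()
    }


# Hypothetical 2-D points standing in for projected narrative embeddings,
# with one invented equipment label per narrative (not data from the study).
points = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (0.1, 0.3),
          (5.2, 4.9), (4.9, 5.0), (0.3, 0.2), (5.1, 5.3)]
equipment = ["none", "loader", "none", "none",
             "loader", "loader", "drill", "loader"]

labels = kmeans(points, k=2)
shares = category_share(labels, equipment)

for cluster in sorted(shares):
    print(cluster, shares[cluster])
```

In this toy setting, each cluster's dominant category and its percentage play the role of the per-cluster figures reported in the abstract. A real analysis would more likely rely on library implementations of the embedding, projection, and clustering steps.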

Publication Title

Engineering Applications of Artificial Intelligence
