Reliable LLM-driven BIM automation through capability-based multi-dimensional evaluation

Document Type

Article

Publication Date

7-2026

Department

Department of Civil, Environmental, and Geospatial Engineering; Department of Cognitive and Learning Sciences

Abstract

Despite the wide adoption of Building Information Modeling (BIM), the modeling process remains labor-intensive. Large Language Models (LLMs) offer the potential to automate BIM by translating natural-language instructions into executable modeling operations, but their reliability in BIM automation has not been systematically evaluated. This paper therefore proposes a capability-based multi-dimensional evaluation framework. Thirty-one tasks spanning component-, assembly-, and system-level modeling were evaluated. A total of 238 errors were recorded and classified into a ten-category taxonomy, with API misuse accounting for 48.3% of failures. LLM reliability was then assessed using a Multi-Attribute Decision-Making framework covering initial robustness, correction efficiency, error diversity, and outcome fidelity. Reliability scores were 0.603 at the component level, 0.505 at the assembly level, and 0.489 at the system level. A benchmark against manual BIM scripting indicates a 46% reduction in completion time but 32% more correction iterations. This study contributes a capability-driven evaluation framework and an error taxonomy that support reliable LLM-driven BIM automation.
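The Multi-Attribute Decision-Making aggregation described above can be sketched minimally as a weighted sum over normalized scores for the four named dimensions. The function below is illustrative only: the attribute values, the equal weights, and the simple weighted-sum rule are assumptions for demonstration, not the paper's actual normalization or weighting scheme.

```python
# Hypothetical sketch of a weighted-sum MADM reliability score over the four
# dimensions named in the abstract. Values and weights are illustrative only.

def reliability_score(attributes, weights=None):
    """Aggregate normalized attribute scores (each in [0, 1]) into one value.

    attributes: dict mapping dimension name -> normalized score.
    weights: optional dict of weights summing to 1; defaults to equal weights.
    """
    if weights is None:
        weights = {k: 1 / len(attributes) for k in attributes}
    return sum(attributes[k] * weights[k] for k in attributes)

# Example with made-up scores (not the paper's data):
score = reliability_score({
    "initial_robustness": 0.80,
    "correction_efficiency": 0.60,
    "error_diversity": 0.70,
    "outcome_fidelity": 0.50,
})
print(round(score, 3))  # 0.65 with equal weights
```

A weighted sum is the simplest MADM aggregation; the paper's framework may use a different method (e.g., TOPSIS or AHP-derived weights), which this sketch does not attempt to reproduce.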

Publication Title

Automation in Construction
