Machine learning for the diagnosis of pulmonary hypertension

College of Computing


Objective This paper investigates whether machine learning (ML) can be used to predict the type of pulmonary hypertension (PH), pre-capillary or post-capillary, from echocardiographic data.

Methods Two hundred and seventy-five patients with PH who underwent both echocardiography and right heart catheterization were included in the study. Mean pulmonary artery pressure and pulmonary artery wedge pressure, both measured by right heart catheterization, were used as the criteria for classifying PH as pre-capillary or post-capillary. Thirteen echocardiographic indicators were used to predict whether the PH was pre-capillary or post-capillary, and nine ML models were used to make the predictions. Accuracy (ACC) was the primary evaluation measure; classification performance was also assessed with the area under the curve (AUC), specificity (Sp), sensitivity (Se), positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (PLR), negative likelihood ratio (NLR) and other measures.

Results Comparison of accuracy, recall and the other evaluation measures across the nine ML models showed that ML can effectively distinguish pre-capillary PH from post-capillary PH. LogitBoost performed best among the nine models (ACC=0.87, Recall=0.83, F1 score=0.85, AUC=0.87, Se=0.90, NPV=0.88, PPV=0.87, PLR=8.61 and NLR=0.18, AUC=0.83) and showed good results in identifying both pre-capillary PH (ACC=0.83, Recall=0.87, F-score=0.85) and post-capillary PH (ACC=0.90, Recall=0.88, F-score=0.89). Decision Tree (ACC=0.75, Recall=0.77, F1 score=0.78, AUC=0.75, Se=0.72, NPV=0.78, PPV=0.77, PLR=3.66 and NLR=0.29, AUC=0.79) performed worst, and the accuracy of each of the other seven models was greater than 0.82.

Conclusion The classification results of the nine ML models indicate that ML methods can effectively distinguish pre-capillary PH from post-capillary PH using echocardiographic data. Unlike diagnosis by right heart catheterization, which is invasive, ML methods can make this distinction under non-invasive conditions.
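
For readers less familiar with the evaluation measures named in the abstract, the following minimal sketch shows how they are conventionally derived from a binary confusion matrix. It is not the authors' code. The class name `ConfusionCounts`, the function `diagnostic_metrics`, the choice of which PH type counts as the positive class, and the example counts are all assumptions made for illustration; the counts are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a binary classifier (hypothetical helper, not from the paper)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Standard definitions of the measures listed in the abstract."""
    se = c.tp / (c.tp + c.fn)            # sensitivity (same formula as recall)
    sp = c.tn / (c.tn + c.fp)            # specificity
    ppv = c.tp / (c.tp + c.fp)           # positive predictive value (precision)
    npv = c.tn / (c.tn + c.fn)           # negative predictive value
    acc = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)
    f1 = 2 * ppv * se / (ppv + se)       # harmonic mean of PPV and Se
    # Likelihood ratios; guard against division by zero at perfect Sp.
    plr = se / (1 - sp) if sp < 1 else float("inf")
    nlr = (1 - se) / sp if sp > 0 else float("inf")
    return {"ACC": acc, "Se": se, "Sp": sp, "PPV": ppv, "NPV": npv,
            "F1": f1, "PLR": plr, "NLR": nlr}


# Hypothetical counts for illustration only; NOT results from the study.
example = ConfusionCounts(tp=45, fp=5, tn=40, fn=10)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

AUC is omitted from the sketch because it depends on the ranking of predicted scores across all thresholds rather than on a single confusion matrix.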
