Significance of variable selection and scaling issues for probabilistic modeling of rainfall-induced landslide susceptibility

Document Type


Publication Date



Department of Mathematical Sciences; Center for Data Sciences


Identifying the input variables (attributes) for probabilistic modeling of rainfall-induced landslides is critical for effective landslide susceptibility characterization. This study evaluates the capabilities of different attribute selectors available in Weka, an open-source machine learning software package, for identifying the most landslide-predictive combination of attributes. The study area is located in the Lake Atitlán watershed in Guatemala, which is highly susceptible to landslides during the rainy season. Landslide initiation points were delineated in the field as well as in orthophotos taken following Hurricane Stan in October 2005. Two datasets spanning differently sized areas were used to compare the success of the attribute selectors and to determine whether the model results from the smaller area could be successfully applied to the larger one. The Weka Bayesian network classification algorithm was used to evaluate the success of the different attribute selection methods and to identify the combination of attributes with the highest rate of landslide prediction for the two study areas. The filtered subset selector proved to be the most successful in identifying the ideal combination for both the smaller- and the larger-scale datasets.
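
As a rough illustration of what subset-based attribute selection does, the sketch below scores every attribute subset of a tiny toy dataset with a correlation-based merit (high correlation with the landslide label, low redundancy among attributes) and keeps the best one. It is only a sketch under stated assumptions: it is not Weka's implementation and not the procedure used in the study, the merit uses plain Pearson correlation, and the attribute names (`slope`, `wetness`, `aspect`) and all values are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: 8 map cells, label 1 = landslide initiation point.
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]
DATA = {
    "slope":   [1, 2, 1, 2, 5, 6, 5, 6],  # strongly related to the label
    "wetness": [0, 0, 0, 1, 0, 1, 1, 1],  # weaker, and redundant with slope
    "aspect":  [1, 2, 2, 1, 1, 2, 2, 1],  # unrelated to the label
}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def subset_merit(subset, data, labels):
    """Correlation-based merit: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(subset)
    # Mean absolute attribute-to-label correlation.
    r_cf = sum(abs(pearson(data[a], labels)) for a in subset) / k
    # Mean absolute attribute-to-attribute correlation (0 for one attribute).
    pairs = list(combinations(subset, 2))
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(data[a], data[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def best_subset(data, labels):
    """Exhaustively score all non-empty subsets and return the best one."""
    names = sorted(data)
    candidates = [
        c for k in range(1, len(names) + 1) for c in combinations(names, k)
    ]
    return max(candidates, key=lambda s: subset_merit(s, data, labels))


if __name__ == "__main__":
    print(best_subset(DATA, LABELS))
```

Exhaustive search is only feasible for a handful of attributes; with more attributes a greedy or best-first search over subsets would be the usual substitute. In a workflow like the one described above, the selected subset would then be passed to a classifier to measure its landslide prediction rate.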

Publisher's Statement

© Korean Spatial Information Society 2017. Publisher's version of record:

Publication Title

Spatial Information Research