Predicting Seismic Damage and Loss for Residential Buildings using Data Science
This thesis presents the application of data science techniques, especially machine learning, for the development of seismic damage and loss prediction models for residential buildings. Current post-earthquake building damage evaluation forms are developed for a particular country in mind. The lack of consistency hinders the comparison of building damage between different regions. A new paper form has been developed to address the need for a global universal methodology for post-earthquake building damage assessment. The form was successfully trialled in the street ‘La Morena’ in Mexico City following the 2017 Puebla earthquake. Aside from developing a framework for better input data for performance based earthquake engineering, this project also extended current techniques to derive insights from post-earthquake observations. Machine learning (ML) was applied to seismic damage data of residential buildings in Mexico City following the 2017 Puebla earthquake and in Christchurch following the 2010-2011 Canterbury earthquake sequence (CES). The experience showcased that it is readily possible to develop empirical data only driven models that can successfully identify key damage drivers and hidden underlying correlations without prior engineering knowledge. With adequate maintenance, such models have the potential to be rapidly and easily updated to allow improved damage and loss prediction accuracy and greater ability for models to be generalised. For ML models developed for the key events of the CES, the model trained using data from the 22 February 2011 event generalised the best for loss prediction. This is thought to be because of the large number of instances available for this event and the relatively limited class imbalance between the categories of the target attribute. For the CES, ML highlighted the importance of peak ground acceleration (PGA), building age, building size, liquefaction occurrence, and soil conditions as main factors which affected the losses in residential buildings in Christchurch. ML also highlighted the influence of liquefaction on the buildings losses related to the 22 February 2011 event. Further to the ML model development, the application of post-hoc methodologies was shown to be an effective way to derive insights for ML algorithms that are not intrinsically interpretable. Overall, these provide a basis for the development of ‘greybox’ ML models.