Electronic Thesis and Dissertation Repository

Thesis Format



Master of Science


Health and Rehabilitation Sciences


Zecevic, Aleksandra


Falls are the leading cause of injury-related hospitalizations among older adults in Canada. This study aimed to identify the most informative diagnostic categories associated with fall-related injuries (FRIs) using three machine learning algorithms: decision tree, random forest, and extreme gradient boosting tree (XGBoost). Secondary data from two Ontario health administrative databases (NACRS, DAD) covering the period 2006-2015 were analyzed. Older adults (aged ≥ 65 years) who sought treatment for FRIs in emergency departments (EDs) or hospitals, as indicated by the Canadian version of the 10th revision of the International Statistical Classification of Diseases and Related Health Problems (ICD-10-CA) codes for falls and injuries, were included in the study. Accuracy, sensitivity, specificity, precision, and F1 score were calculated for each model. A total of 631,339 ED admissions and 304,495 hospitalizations due to FRIs were recorded. The random forest model demonstrated the highest sensitivity and accuracy in both datasets. Dyspnea was the most informative ICD-10-CA code for FRIs among older adults admitted to EDs, and secondary malignant neoplasm of liver and intrahepatic bile duct was the most informative disease among those admitted to hospitals. These findings indicate that machine learning models can be used to study FRIs, as they are capable of handling large datasets and achieving accuracy above 60%. In addition, the diagnostic categories linked to FRIs have the potential to enhance healthcare providers' ability to prevent FRIs in the future.
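For readers unfamiliar with the five evaluation measures named above, the following is a minimal sketch of how they are conventionally derived from confusion-matrix counts. The function name `classification_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not results from this study, and the thesis does not state that its measures were computed with this code.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard evaluation measures from confusion-matrix counts.

    tp / fp / tn / fn are the numbers of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs).
    """

    def safe_div(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is zero.
        return numerator / denominator if denominator else 0.0

    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    sensitivity = safe_div(tp, tp + fn)  # also called recall
    specificity = safe_div(tn, tn + fp)
    precision = safe_div(tp, tp + fp)
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Invented counts for illustration only (not data from the thesis).
metrics = classification_metrics(tp=60, fp=20, tn=80, fn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters when classes are imbalanced, because a model can reach high accuracy while missing most positive cases.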

Summary for Lay Audience

In Canada, falls are responsible for many emergency room visits and hospitalizations among older adults. This study explored the connection between injuries older adults experienced after a fall and other diagnoses they received while in an emergency room or a hospital. We used advanced computer calculations, also called machine learning, to determine which diagnostic categories are most closely related to fall-related injuries (FRIs) and provide the most useful information. Data from two large health databases (NACRS, DAD) covering the years 2006 to 2015 in the Canadian province of Ontario were analyzed. Three machine learning algorithms were compared for accuracy and sensitivity. The results revealed that the random forest model was the most accurate and sensitive. One diagnostic code and one diagnostic category were identified as the most informative: in the emergency department, the presence of dyspnea, or shortness of breath, was found to be a notable factor, and in hospitals, the presence of an abnormal tumor in the bile duct and liver, also known as secondary malignant neoplasm of liver and intrahepatic bile duct, was identified as highly relevant. These findings show that machine learning models can be used in studies about FRIs. These models can handle large amounts of data and reach an accuracy higher than 60%. The most informative diagnostic categories associated with FRIs can help healthcare providers better understand the risks of falls in older adults and improve their ability to prevent FRIs in the future.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 License.