Electronic Thesis and Dissertation Repository

Thesis Format

Integrated Article

Degree

Master of Science

Program

Computer Science

Supervisor

Grace Y. Yi

Abstract

This research investigates the mortality risk of COVID-19 patients across different variant waves, using data from the Centers for Disease Control and Prevention (CDC) websites. By analyzing the available data, including patient medical records, vaccination rates, and hospital capacities, we aim to identify patterns and factors associated with COVID-19-related deaths.

To explore features linked to COVID-19 mortality, we employ feature selection techniques such as filter, wrapper, and embedded methods. Furthermore, we apply various machine learning methods, including support vector machines, decision trees, random forests, logistic regression, K-nearest neighbours, naïve Bayes methods, and artificial neural networks, to uncover underlying trends and correlations within the data.
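As a rough illustration of the filter family of feature selection mentioned above, the following sketch ranks binary features by their Pearson chi-squared association with a binary outcome. It is a minimal example under stated assumptions, not the procedure used in the thesis: the data are synthetic, and the names `chi2_score`, `rank_features`, `icu_admission`, and `unrelated_flag` are hypothetical. Wrapper and embedded methods are not shown.

```python
import random


def chi2_score(feature, outcome):
    """Pearson chi-squared statistic for the 2x2 table of two 0/1 lists."""
    n = len(feature)
    # obs[f][y] counts records with feature value f and outcome value y.
    obs = [[0, 0], [0, 0]]
    for f, y in zip(feature, outcome):
        obs[f][y] += 1
    row = [sum(obs[0]), sum(obs[1])]
    col = [obs[0][0] + obs[1][0], obs[0][1] + obs[1][1]]
    score = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected > 0:
                score += (obs[i][j] - expected) ** 2 / expected
    return score


def rank_features(columns, outcome):
    """Return (name, score) pairs sorted from strongest to weakest association."""
    scores = {name: chi2_score(values, outcome) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic data (an assumption for illustration only, not CDC data):
# one feature that shifts the outcome probability and one unrelated feature.
random.seed(0)
n = 2000
icu = [int(random.random() < 0.2) for _ in range(n)]
noise = [int(random.random() < 0.5) for _ in range(n)]
death = [int(random.random() < (0.5 if x else 0.05)) for x in icu]

columns = {"icu_admission": icu, "unrelated_flag": noise}
ranking = rank_features(columns, death)
for name, score in ranking:
    print(f"{name}: chi2 = {score:.1f}")
```

A filter method scores each feature independently of any classifier, which makes it cheap but blind to interactions between features; wrapper and embedded methods trade extra computation for that awareness.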

The study identifies nine crucial factors significantly impacting patient survival in the context of COVID-19. These comprise six patient-level factors (pre-existing medical conditions, acute respiratory distress syndrome status, pneumonia status, age group category, headache status, and shortness of breath (dyspnea) status) and three factors describing the patient's hospital care: hospitalization status, mechanical ventilation status, and intensive care unit admission status.

Using these identified features, we further conduct a detailed statistical analysis with a logistic regression model to estimate the effects of these risk factors on COVID-19 mortality. The findings indicate that most of the identified factors are statistically significant in influencing the likelihood of mortality. However, exceptions and variations are observed across different waves of COVID-19 variants, underscoring the dynamic nature of the pandemic. This study contributes insights into the evolving landscape of COVID-19 outcomes.
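To make concrete what "estimating the effect of a risk factor" with logistic regression means, the sketch below handles the simplest case: one binary risk factor and a binary outcome. In that case the maximum-likelihood slope equals the log odds ratio of the 2x2 table, so no iterative fitting is needed. The counts are invented for illustration and are not results from the thesis; the function names `fit_single_binary_logit` and `wald_p_value` are hypothetical.

```python
import math


def fit_single_binary_logit(a, b, c, d):
    """Closed-form logistic regression fit for one binary covariate.

    a: exposed and died       b: exposed and survived
    c: unexposed and died     d: unexposed and survived
    Returns (intercept, slope, standard error of the slope).
    """
    intercept = math.log(c / d)          # log odds of death when unexposed
    slope = math.log((a * d) / (b * c))  # log odds ratio, exposed vs unexposed
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return intercept, slope, se


def wald_p_value(slope, se):
    """Two-sided p-value for H0: slope = 0, using the normal approximation."""
    z = abs(slope / se)
    return math.erfc(z / math.sqrt(2))


# Hypothetical counts (an assumption, not thesis data): 100 patients with
# the risk factor, 40 of whom died; 100 without it, 10 of whom died.
intercept, slope, se = fit_single_binary_logit(a=40, b=60, c=10, d=90)
odds_ratio = math.exp(slope)
p_value = wald_p_value(slope, se)

print(f"odds ratio = {odds_ratio:.2f}")
print(f"slope = {slope:.3f}, SE = {se:.3f}, p = {p_value:.2g}")
```

With several covariates the coefficients no longer have a closed form and are obtained by iterative maximum likelihood, but each exponentiated coefficient is still read as an odds ratio adjusted for the other covariates.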

