Electronic Thesis and Dissertation Repository

Thesis Format

Monograph

Degree

Doctor of Philosophy

Program

Electrical and Computer Engineering

Supervisor

Capretz, Miriam

2nd Supervisor

Grolinger, Katarina

Co-Supervisor

Abstract

Today, the amount of data collected is exploding at an unprecedented rate due to developments in Web technologies, social media, mobile and sensing devices, and the Internet of Things (IoT). Data is gathered in every aspect of our lives: from financial information to smart home devices and everything in between. The driving force behind these extensive data collections is the promise of increased knowledge. Therefore, the potential of Big Data relies on our ability to extract value from these massive data sets. Machine learning is central to this quest because of its ability to learn from data and provide data-driven insights, decisions, and predictions. However, traditional machine learning approaches were developed in a different era and are thus based on multiple assumptions that, unfortunately, no longer hold true in the context of Big Data.

This thesis presents the challenges associated with performing machine learning on Big Data and highlights the cause-effect relationship between the defining dimensions of Big Data and the application of machine learning techniques. Additionally, emerging machine learning paradigms are identified, along with how they can handle these challenges. Although many areas of research and applications are affected by these challenges, this thesis focuses on tackling those associated with electrical load forecasting. Consequently, two of the identified challenges are addressed.

Firstly, an adaptation of the transformer architecture for electrical load forecasting is proposed in order to address the training time performance-related challenge associated with deep learning algorithms. The results showed improved accuracy over the current state-of-the-art algorithm across various forecasting horizons, and the architecture's amenability to parallelization addressed the performance shortcomings.
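
The parallelism that makes the transformer attractive for reducing training time comes from its attention mechanism, which relates all time steps at once rather than stepping through the sequence as a recurrent model does. The following is a minimal NumPy sketch of scaled dot-product attention over a hypothetical 24-hour load window; the dimensions and names are illustrative, not the thesis's implementation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attend over all time steps in one matrix product; no sequential recurrence."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (T, T) pairwise step scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over time steps
    return weights @ V                               # (T, d_v) context per step

# Hypothetical hourly-load example: 24 time steps, 8-dimensional embeddings
rng = np.random.default_rng(0)
T, d = 24, 8
X = rng.standard_normal((T, d))
out = scaled_dot_product_attention(X, X, X)          # self-attention
print(out.shape)                                     # every hour attends to every other hour
```

Because the score matrix is computed as one dense product, the whole window is processed in parallel on suitable hardware, in contrast to the step-by-step dependency of recurrent load-forecasting models.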

Secondly, a transfer learning algorithm is proposed to scale the learning of load forecasting tasks and effectively address the performance challenges associated with transfer learning. Additionally, the diversity of the data was examined to analyze the portability of the results. Despite varying data distributions, the learned concepts and results were repeatable over multiple streams. The results showed significant improvements to machine learning model training time, with the scaled models being 1.7 times faster on average, leading to much more efficient model deployment times.
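
The intuition behind the training-time gain from transfer learning can be illustrated with a minimal sketch (not the thesis's algorithm): a model trained on one load stream is reused as the starting point for a similar stream, so fewer optimization steps are needed than when training from scratch. All task parameters below are hypothetical.

```python
import numpy as np

def fit(X, y, w0, lr=0.1, tol=1e-6, max_iter=10_000):
    """Plain gradient descent on mean-squared error; returns (weights, iterations used)."""
    w = w0.astype(float).copy()
    for it in range(1, max_iter + 1):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
        if np.mean((X @ w - y) ** 2) < tol:
            return w, it
    return w, max_iter

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))
y_source = X @ np.array([2.0, 1.0])      # hypothetical "source" load stream
y_target = X @ np.array([2.1, 1.05])     # similar "target" load stream

w_src, _ = fit(X, y_source, np.zeros(2))
_, cold_iters = fit(X, y_target, np.zeros(2))   # train from scratch
_, warm_iters = fit(X, y_target, w_src)         # warm-start from the source model
print(cold_iters, warm_iters)                   # warm start needs fewer iterations
```

Because the two streams' optima are close, the warm-started model begins near its target and converges in fewer iterations, which is the same mechanism, at toy scale, by which transferring learned concepts across streams reduces training and deployment time.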

Summary for Lay Audience

Today, the amount of data collected is exploding at an unprecedented rate due to developments in Web technologies, social media, mobile and sensing devices, and the Internet of Things (IoT). Data is gathered in every aspect of our lives: from financial information to smart home devices and everything in between. The driving force behind these extensive data collections is the promise of increased knowledge. Therefore, the potential of Big Data relies on our ability to extract value from these massive data sets. Machine learning is central to this quest because of its ability to learn from data. However, traditional machine learning approaches were developed in a different era and thus face a number of challenges.

This thesis presents those challenges associated with performing machine learning on Big Data. Although many research areas and applications are affected by these challenges, this thesis focuses on tackling those related to electrical load forecasting. Consequently, two of the identified challenges are addressed. Firstly, an adapted architecture for electrical load forecasting is proposed in order to address the training time performance-related challenge associated with deep learning algorithms. Secondly, a transfer learning algorithm is proposed to improve the learning time of load forecasting tasks and effectively address the performance challenges associated with transfer learning.
