Electronic Thesis and Dissertation Repository

Degree

Master of Engineering Science

Program

Electrical and Computer Engineering

Supervisor

Dr. Miriam A. M. Capretz

Abstract

Data are becoming an ever larger part of people's lives. With the spread of technologies such as sensors and the Internet of Things, data gathering is becoming accessible to ordinary users. With these data in hand, users want to extract insights from them, and they want results as soon as possible. However, average users have little or no experience in data analytics and machine learning, and they rarely collect enough data to build their own machine learning models. Because large quantities of similar data are being generated around the world and many machine learning models are already in use, it should be possible to use additional data and existing models to create accurate machine learning models for these users.

This thesis proposes Agora, a Web-based marketplace where users can share their data and machine learning models with other users who have small datasets and little experience. The thesis gives an overview of all the components that make up Agora and details two of its main components: Hephaestus and Sibyl.

Hephaestus is a domain adaptation method for multi-feature regression models with seasonal adjustment, which can improve predictions for small datasets using information from additional datasets. Hephaestus operates in the pre- and post-processing phases, so it can be combined with any standard machine learning algorithm. As a case study, we used the proposed method to build models that predict school energy consumption from only one month of data, reaching the same level of accuracy as models built with 12 months of data.
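The idea of wrapping an arbitrary regressor between a pre-processing and a post-processing step can be illustrated with a minimal sketch. This is not the Hephaestus algorithm itself: the scaling scheme, the function names (`fit_line`, `fit_with_adaptation`), and the single-feature setup are assumptions chosen for illustration, and the seasonal adjustment described above is not modeled. The sketch normalizes each additional dataset by its own mean, fits any regressor on the pooled data, and then rescales predictions using the few observations available for the small dataset.

```python
from statistics import mean
from typing import Callable

# (feature values, target values); a hypothetical single-feature layout.
Dataset = tuple[list[float], list[float]]
Model = Callable[[float], float]


def fit_line(xs: list[float], ys: list[float]) -> Model:
    """One-feature least squares; stands in for any standard regressor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def fit_with_adaptation(
    sources: list[Dataset],
    target: Dataset,
    fit: Callable[[list[float], list[float]], Model] = fit_line,
) -> Model:
    # Pre-processing (assumed scheme): divide each source's targets by
    # that source's mean so that all sources share a common scale.
    pooled_x: list[float] = []
    pooled_y: list[float] = []
    for xs, ys in sources:
        scale = mean(ys)
        pooled_x.extend(xs)
        pooled_y.extend([y / scale for y in ys])

    # Any regressor can be plugged in here; it only sees normalized data.
    model = fit(pooled_x, pooled_y)

    # Post-processing (assumed scheme): estimate the target's scale from
    # its few observations, then map predictions back to its units.
    target_x, target_y = target
    target_scale = mean(target_y) / mean([model(x) for x in target_x])
    return lambda x: model(x) * target_scale


# Two sources with a full year of data, one target with two observations.
months = [float(m) for m in range(12)]
school_a = (months, [1.0 * (2 * m + 10) for m in months])
school_b = (months, [3.0 * (2 * m + 10) for m in months])
new_school = ([0.0, 1.0], [50.0, 60.0])
predict = fit_with_adaptation([school_a, school_b], new_school)
```

Because the adaptation happens entirely outside `fit`, replacing `fit_line` with another learner requires no change to the surrounding steps, which is the property the paragraph above attributes to working in the pre- and post-processing phases.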

Sibyl is a flexible, scalable, and non-blocking machine-learning-as-a-service platform that facilitates creating multiple predictive models and running them at the same time. As a case study, we implemented Sibyl with three machine learning algorithms to show how easily new algorithms can be added. We also executed three models at the same time to demonstrate that they can run without interfering with one another.
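The two properties highlighted above, pluggable algorithms and non-blocking concurrent execution, can be sketched as follows. This is a hypothetical illustration, not Sibyl's architecture or API: the registry, the `register` decorator, the `submit_training` function, and the three toy "algorithms" (each of which just returns a constant predictor) are all assumptions made for this example.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from statistics import mean, median
from typing import Callable

# Hypothetical registry mapping an algorithm name to a training function.
# Each training function takes target values and returns a "model",
# which here is simply a constant prediction.
ALGORITHMS: dict[str, Callable[[list[float]], float]] = {}


def register(name: str):
    """Add a training function to the registry under the given name."""
    def decorator(fn: Callable[[list[float]], float]):
        ALGORITHMS[name] = fn
        return fn
    return decorator


@register("mean")
def train_mean(ys: list[float]) -> float:
    return mean(ys)


@register("median")
def train_median(ys: list[float]) -> float:
    return median(ys)


@register("last")
def train_last(ys: list[float]) -> float:
    return ys[-1]


# One worker per concurrent job in this toy setup.
executor = ThreadPoolExecutor(max_workers=3)


def submit_training(algorithm: str, ys: list[float]) -> Future:
    """Start a training job and return immediately with a Future."""
    # Each job receives its own copy of the data, so jobs share no state.
    return executor.submit(ALGORITHMS[algorithm], list(ys))


# Submit three jobs without waiting for any of them to finish.
data = [1.0, 2.0, 6.0]
jobs = {name: submit_training(name, data) for name in ALGORITHMS}
models = {name: job.result() for name, job in jobs.items()}
```

Adding a fourth algorithm only requires another `@register(...)` function. In a real service, CPU-bound training in Python would more likely run in separate processes or on separate machines than in threads; the thread pool is used here only to keep the sketch short.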

The results obtained in this research demonstrate the concept of Agora: users can share the same platform to provide or consume knowledge and to create multiple concurrent machine learning models.