Thesis Format
Monograph
Degree
Master of Engineering Science
Program
Electrical and Computer Engineering
Supervisor
Capretz, Miriam A. M.
Abstract
Spatiotemporal forecasting can be described as predicting the future value of a variable given the time and place at which it will occur. This type of forecasting task can help many institutions and businesses answer questions such as how many people will visit a given hospital in the next hour. Answers to these questions have the potential to spur significant socioeconomic impact by providing privacy-friendly short-term forecasts about geolocated events, which in turn can help entities plan and operate more efficiently. These seemingly simple questions, however, present complex challenges to forecasting systems. With more GPS-enabled devices connected every year, from smartphones to wearables to IoT devices, the volume of collected spatiotemporal data that accompanies these questions has exploded, following the Big Data trend. This thesis proposes a forecasting framework that employs distributed computing to scale its internal components and handle this high data volume. It also designs discretization components that allow for flexibility in the framing of the forecasting questions. Furthermore, it devises a Geographically Global Model (GGM) backed by an ensemble of Stochastic Gradient Boosted Trees, a collection of Geographically Local Models (GLMs) backed by ARIMA models, and a non-linear blending of the two as part of its multistage machine learning pipeline to boost its performance and stability. The merit of the proposed research is evaluated in three experiments, each comprising millions of records: forecasting hourly taxi demand in New York City, forecasting daily crime density in Chicago, and forecasting hourly visits to places of interest across Canada. The experimental results show that the proposed Spatiotemporal Forecasting Framework produces stable forecasts across the three domains while outperforming the naive baseline by at least 49.8% as measured by the symmetric mean absolute percentage error (SMAPE).
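As a rough illustration of how a SMAPE-based comparison against a naive baseline can be computed, the sketch below uses one common SMAPE definition (absolute error divided by the mean of the absolute actual and forecast values, scaled to 0 to 200 percent). The abstract does not state which SMAPE variant or which naive baseline the thesis uses, so the formula, the persistence-style baseline, and the names `smape`, `naive_forecast`, and `relative_improvement` are assumptions made here for illustration only.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (range 0 to 200).

    Assumed variant: |F - A| / ((|A| + |F|) / 2), averaged over all points.
    Points where both values are zero contribute zero error.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        if denom > 0:
            total += abs(f - a) / denom
    return 100.0 * total / len(actual)


def naive_forecast(series, lag=1):
    """Persistence baseline (an assumption): predict each value as the one `lag` steps earlier.

    The returned list lines up with series[lag:].
    """
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return series[:-lag]


def relative_improvement(baseline_error, model_error):
    """Percentage by which the model error is lower than the baseline error."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return 100.0 * (baseline_error - model_error) / baseline_error


# Hypothetical hourly counts for one location; not data from the thesis.
hourly_counts = [12, 15, 14, 20, 18, 25]
model_predictions = [14, 15, 19, 19, 24]  # hypothetical forecasts for hours 1..5

baseline_error = smape(hourly_counts[1:], naive_forecast(hourly_counts))
model_error = smape(hourly_counts[1:], model_predictions)
improvement = relative_improvement(baseline_error, model_error)
```

Under this reading, a figure such as "at least 49.8%" would correspond to `improvement` being 49.8 or higher in each experiment, although the thesis itself should be consulted for the exact definition it uses.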
Summary for Lay Audience
Spatiotemporal forecasting can be described as predicting the future value of a variable given the time and place at which it will occur. This type of forecasting task can help many institutions and businesses answer questions such as how many people will visit a given hospital in the next hour. Answers to these questions have the potential to spur significant socioeconomic impact by providing privacy-friendly short-term forecasts about geolocated events, which in turn can help entities plan and operate more efficiently. These seemingly simple questions, however, present complex challenges to forecasting systems. With more GPS-enabled devices connected every year, from smartphones to wearables to IoT devices, the volume of collected spatiotemporal data that accompanies these questions has exploded, following the Big Data trend. This thesis proposes a forecasting framework that employs distributed computing to scale its internal components and handle this high data volume. It also designs discretization components that allow for flexibility in the framing of the forecasting questions. The merit of the proposed research is evaluated in three experiments, each comprising millions of records: forecasting hourly taxi demand in New York City, forecasting daily crime density in Chicago, and forecasting hourly visits to places of interest across Canada. The experimental results show that the proposed Spatiotemporal Forecasting Framework produces stable forecasts across the three domains while outperforming the naive baseline by at least 49.8%.
Recommended Citation
Nascimento de Aguiar, Rafael Felipe, "Spatiotemporal Forecasting At Scale" (2019). Electronic Thesis and Dissertation Repository. 6316.
https://ir.lib.uwo.ca/etd/6316
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.