Electronic Thesis and Dissertation Repository

Thesis Format

Integrated Article

Degree

Doctor of Philosophy

Program

Statistics and Actuarial Sciences

Supervisor

Douglas Woolford

Affiliation

The University of Western Ontario

2nd Supervisor

Charmaine B. Dean

Affiliation

University of Waterloo

Co-Supervisor

3rd Supervisor

W. John Braun

Affiliation

The University of British Columbia - Okanagan

Co-Supervisor

Abstract

This thesis develops and applies novel techniques for the study of complex data structures with applications to wildland fire analytics and sports analytics. We consider situations where different models share information: the many variables recorded simultaneously during aerial wildland fire fighting, the relationship between the frequency and severity of wildland fires, and the decomposition of hockey players' shot locations into spatial components that are shared across players.

The first study analyzes flight patterns while fighting a wildland fire using several outlier detection techniques. These techniques apply several definitions of "outlier" to determine whether the pilot did something different while dropping water on a fire. To aid in fire management, we developed a tool to display the outliers in a way that is meaningful to the managers.

Our second set of studies analyzes the association between ignitions of wildland fires (temporally and spatially, respectively) and the total area burned by those fires. We used a modelling approach that allows two models to share information in order to get a better estimate of the underlying process and elucidate underlying relationships. The first version of this analysis models the number of fires per day, and the second version models the number of fires in any arbitrary region. The association between ignition probability and size is not always positive; in some time periods or regions there is a negative association, indicating that high ignition probability is associated with small fires. Furthermore, our results may suggest a change in our ability to detect and measure fires.

Our final study moves away from wildland fire and into sports analytics. We employ the spatial techniques used in the previous analysis to characterize shot locations in the National Hockey League (NHL). These techniques were augmented with image recognition algorithms to summarize the spatial distribution of shots as a set of coefficients for a collection of estimated basis functions. A definition of shot quality was created based on the coefficients for the spatial distribution of goals.

Each of the papers contained in this thesis is an application of spatial, temporal, or spatio-temporal techniques to multivariate data with a new methodological extension. The aerial fire fighting paper provides visualizations to summarize a variety of statistical process control techniques; the wildland fire ignition papers provide joint modelling approaches for temporal and spatial point processes, respectively; and the hockey paper summarizes the similarities between a large number of spatial point processes.

Summary for Lay Audience

This thesis develops and applies novel data science techniques for the study of complex data structures with applications to wildland fire analytics and sports analytics. Broadly, we look at situations where several different models share information. Many different flight variables are analyzed together in the study of water bomber pilots, wildfire ignitions are analyzed along with the fire sizes, and NHL players are all modelled together.

First, we use descriptive statistics and methods from quality control to detect when a pilot of a water bomber attacking a wildland fire does something unusual while flying. The pilots in our study repeatedly dropped water on a fire, and we expect that each drop should look approximately the same. Unexpected things can happen while flying, so we do not claim that our method detected errors. Instead, we found which drops had something unusual happen and provided a way for the managers to inspect the unusual features in the flight.

We then move on to study the fires themselves. We consider either the number of fires per day or the number of fires in a given spatial region. Both models are fit along with the total burn area of the fires. By modelling both of these outcomes at the same time, we can find patterns of association between the ignition probability and the size of the fires. We can determine times or areas that have a positive association, meaning that a high ignition rate was associated with larger fires. However, we also sometimes found the reverse: higher ignition probability was associated with smaller fires, or vice versa. We also found evidence that our ability to detect and measure fires has improved over time.

Our final study moves away from wildland fire and into sports analytics in a spatial context. We use the same spatial techniques as the ignition probability model, but apply them to shots in National Hockey League games. A different spatial estimate was found for each player, then the most common shooting locations were found using an image recognition technique. The image recognition technique produces a set of numbers that describe how much each player shoots from each of the common shooting locations, and this set of numbers gives a concise summary of the similarity between players. By looking at the number of goals from each location, we can provide a measure of shot quality.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
