Electronic Thesis and Dissertation Repository

Thesis Format

Integrated Article

Degree

Master of Science

Program

Geography and Environment

Collaborative Specialization

Planetary Science and Exploration

Supervisor

Wang, Jinfei

2nd Supervisor

Barmby, Pauline

Abstract

Galaxies are complex arrangements of components such as stars, dust, and gas, whose spatial and temporal relationships can help us better understand the formation and evolution of galaxies and, ultimately, the Universe. The main objective of this study is to test how machine learning can be used to classify galaxy components and stellar ages within spiral galaxies based on pixel values in Hubble Space Telescope imagery, Euclidean distance calculations, textural features, and band ratios. We develop two machine learning models using maximum likelihood, random forest, and support vector machine algorithms. We find that the models successfully classify galaxy components and stellar age, with Euclidean distance and textural features being the most important parameters. These methods can contribute to the rapid processing of high-resolution astronomical imagery of galaxies and other celestial phenomena.

Summary for Lay Audience

The Universe is thought to have formed around 14 billion years ago, with our Milky Way galaxy forming soon after. The Milky Way and all other galaxies are made of components such as stars, dust, and gas. Different types of galaxies exhibit different patterns of components: elliptical galaxies are round in shape and host a large population of older stars, spiral galaxies are characterized by arms extending outward and have a larger population of young stars, and irregular galaxies lack patterns, exhibiting random distributions of young and old stars. Temperature and brightness determine the colour of the stars we observe. Younger stars are hotter and brighter and appear bluer, while older stars are cooler and dimmer and appear redder. In this thesis, we use remote sensing techniques to observe galaxy components and the ages of stars within two spiral galaxies. Remote sensing can be defined as the gathering of information about phenomena by distant observation, for example by using satellites or telescopes. Our information comes from light emitted by components within the spiral galaxies, recorded as Hubble Space Telescope imagery. The Hubble Space Telescope takes images through filters that each isolate a specific colour of light (e.g., blue light) emitted from phenomena. We train a computer to automatically classify the stellar age and galaxy component membership of each pixel in the Hubble Space Telescope images; this process is called machine learning. We use information stored within the images to train the computer: Hubble Space Telescope images in several colours of light, the distance of each pixel from the spiral arms and from the galaxy center, patterns in the spatial distribution of the galaxy components and stellar ages, and band ratios that compare the amounts of different colours of light emitted from the galaxies (e.g., blue light divided by green light).
By observing the different ages of stars and the spatial relationships between the components within galaxies, we can better understand the formation and evolution of galaxies, the Universe, and ultimately how matter formed.
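
As a minimal illustration of two of the inputs described above, the sketch below computes a band ratio (blue light divided by green light) and the Euclidean distance of each pixel from a galaxy center for a toy 2 x 2 image, then stacks them into one feature vector per pixel. The function names, the toy pixel values, and the choice of center are assumptions made for illustration only; this is not the processing pipeline used in the thesis.

```python
import math


def band_ratio(band_a, band_b, eps=1e-12):
    """Pixel-wise ratio of two equally sized images (lists of rows).

    Pixels where the denominator is effectively zero are set to NaN.
    """
    return [
        [a / b if abs(b) > eps else float("nan") for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(band_a, band_b)
    ]


def distance_from_center(n_rows, n_cols, center):
    """Euclidean distance, in pixels, of every pixel from center (row, col)."""
    c_row, c_col = center
    return [
        [math.hypot(r - c_row, c - c_col) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical toy images standing in for two filter bands.
blue = [[4.0, 2.0], [1.0, 0.5]]
green = [[2.0, 2.0], [2.0, 0.0]]

ratio = band_ratio(blue, green)
dist = distance_from_center(2, 2, center=(0, 0))  # assumed galaxy center

# One feature vector per pixel: [blue, green, blue/green, distance to center].
features = [
    [blue[r][c], green[r][c], ratio[r][c], dist[r][c]]
    for r in range(2)
    for c in range(2)
]
```

Each row of `features` could then be passed, together with a class label, to a classifier such as a random forest or support vector machine.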