Master of Science
Dr. Lila Kari
Representing DNA sequences graphically, and evaluating and displaying relationships among species, are important aspects of molecular biology research. This thesis proposes a novel approach that combines three methods: a) Chaos Game Representation (CGR), to portray quantitative characteristics of a DNA sequence as a black-and-white image; b) the Structural Similarity (SSIM) index, an image comparison method, to compute pairwise distances between these images; and c) Multidimensional Scaling (MDS), to display each sequence as a point in a two-dimensional Euclidean space. When applied to a collection of genomic DNA sequences, the proposed method produces a visual representation called a Genome Distance Map (GDM). In a Genome Distance Map, the sequences are visualized as points in a common two-dimensional Euclidean space, wherein the geometric distance between any two points approximates the difference between their respective DNA sequence compositions. In addition, the Genome Distance Map provides a compelling visualization of species’ relatedness in comparison to phylogenetic trees. Moreover, the proposed method is sensitive and robust in detecting insertions, deletions, and substitutions of nucleotides in a genome.
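As a rough illustration of the first step of this pipeline, the sketch below builds a black-and-white CGR image from a DNA string using only the Python standard library. The corner assignment (A, C, G, T), the image size, and the function names `cgr_image` and `pixel_distance` are assumptions made for this example and are not taken from the thesis. The `pixel_distance` function is a deliberately simple stand-in, not the SSIM index; in the approach described above, SSIM would supply the pairwise distances and MDS would then place each sequence in the plane.

```python
# Assumed corner convention for the unit square; the thesis may use another.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_image(sequence, size=8):
    """Return a size x size binary grid; 1 marks a pixel hit by a CGR point."""
    image = [[0] * size for _ in range(size)]
    x, y = 0.5, 0.5  # start at the centre of the unit square
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the nucleotide's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        image[row][col] = 1
    return image


def pixel_distance(img_a, img_b):
    """Fraction of differing pixels (a simple stand-in, NOT the SSIM index)."""
    total = len(img_a) * len(img_a[0])
    differing = sum(
        a != b
        for row_a, row_b in zip(img_a, img_b)
        for a, b in zip(row_a, row_b)
    )
    return differing / total


if __name__ == "__main__":
    first = cgr_image("ACGTACGTTTGA")
    second = cgr_image("ACGTACGATTGA")  # one substitution relative to first
    print(pixel_distance(first, second))
```

Because each CGR point depends on the preceding nucleotides, a single substitution shifts that point and, to a shrinking degree, the points that follow, so the two images differ in a small number of pixels.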
Sayem, Abu Sadat Md., "A quantitative method for measuring and visualizing species' relatedness in a two-dimensional Euclidean space." (2013). Electronic Thesis and Dissertation Repository. 1258.