Electronic Thesis and Dissertation Repository

Selection Pressure on Surface Exposed Virus Proteins

Sareh Bagherichimeh, The University of Western Ontario

Abstract

Viral infection requires interaction between virus surface-exposed (SE) proteins and host cell receptors. This can result in an “arms race” that is assumed to drive accelerated rates of evolution, and some well-known examples of diversifying selection involve surface proteins (HIV-1 env, influenza hemagglutinin). We conducted a systematic analysis to determine whether this is truly a distinctive feature of SE virus proteins, in comparison to non-SE proteins encoded by the same genomes.

We obtained the reference genome and all neighbour genomes of 52 human viruses from the NCBI Viral Genomes database. The coding sequences (CDSs) of each genome were extracted by pairwise alignment against the reference CDSs and labeled as SE or non-SE using the Gene Ontology database and the transmembrane predictor TMbed. After generating codon-aware multiple sequence alignments, we used FUBAR to estimate the joint probability distribution over 20 non-synonymous and synonymous rates for each alignment (the evolutionary fingerprint). We calculated the cosine distance between every pair of fingerprints and visualized the results using PCA.
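The fingerprint comparison described above can be illustrated with a minimal sketch. It assumes each evolutionary fingerprint has already been flattened into a vector of posterior weights over the rate grid. The function names and the toy vectors are hypothetical and are not taken from the thesis or from FUBAR's output format.

```python
import math

def cosine_distance(p, q):
    """Cosine distance (1 - cosine similarity) between two flattened fingerprints."""
    if len(p) != len(q):
        raise ValueError("fingerprints must cover the same rate grid")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        raise ValueError("a fingerprint with zero norm has no defined direction")
    return 1.0 - dot / (norm_p * norm_q)

def pairwise_distances(fingerprints):
    """Symmetric matrix of cosine distances between every pair of fingerprints."""
    n = len(fingerprints)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = cosine_distance(fingerprints[i], fingerprints[j])
            dist[i][j] = dist[j][i] = d
    return dist

# Hypothetical toy fingerprints on a 2 x 2 rate grid (flattened to 4 cells).
toy_fingerprints = [
    [0.7, 0.1, 0.1, 0.1],  # mass concentrated in the first grid cell
    [0.1, 0.1, 0.1, 0.7],  # mass concentrated in the last grid cell
    [0.7, 0.1, 0.1, 0.1],  # identical to the first fingerprint
]
dist = pairwise_distances(toy_fingerprints)
```

A distance matrix of this kind, or the fingerprint vectors themselves, could then be passed to a PCA routine for visualization. That step is omitted here because the abstract does not specify how the PCA was set up.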

In total, we analyzed 670 sets of homologous genes (125 of which were SE) from 21 virus families. We found no clear separation of SE from non-SE labels by PCA. Additionally, there were no significant differences between SE and non-SE genes in the codon site-specific mean dN/dS ratios, dN−dS differences, dN or dS independently, or the percentage of positively and/or negatively selected sites (Wilcoxon rank sum test, significance threshold of p < 0.05).
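For readers unfamiliar with the test named above, the following is a minimal sketch of a two-sided Wilcoxon rank sum test using the normal approximation. It assigns average ranks to ties but applies no tie or continuity correction, which may differ from the implementation used in the thesis. The function names and the per-gene values are hypothetical illustrations, not data from the study.

```python
import math

def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Return (z, two-sided p) under the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    w = sum(ranks[:n1])                        # rank sum of the first sample
    mean_w = n1 * (n1 + n2 + 1) / 2.0          # expected rank sum under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2.0))     # equals 2 * (1 - Phi(|z|))
    return z, p

# Hypothetical per-gene mean dN/dS values, for illustration only.
se_genes = [0.21, 0.35, 0.18, 0.40, 0.27]
non_se_genes = [0.25, 0.30, 0.22, 0.38, 0.19, 0.33]
z, p = rank_sum_test(se_genes, non_se_genes)
significant = p < 0.05
```

With samples this small an exact test would normally be preferred. The normal approximation is used here only to keep the sketch short.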

In closing, we did not find evidence that human virus genes encoding surface-exposed virus proteins undergo higher rates of adaptation than other protein-coding regions in the viral genome.