Thesis Format

Monograph

Decoy-Target Database Strategy and False Discovery Rate Analysis for Glycan Identification

Xiaoou Li

Degree

Master of Science

Program

Computer Science

Collaborative Specialization

Biostatistics

Supervisor

Kaizhong Zhang

Abstract

In recent years, glycopeptide sequencing from tandem mass spectrometry (MS/MS) data has made remarkable progress. Various software tools have been developed and widely used for protein identification. Estimation of the false discovery rate (FDR) has become essential for evaluating the performance of glycopeptide scoring algorithms. The target-decoy strategy, which involves constructing decoy databases, is currently the most widely used method for FDR calculation. In this study, we applied various decoy construction algorithms to generate decoy glycan databases and proposed a novel approach that calculates the FDR using the EM algorithm and a mixture model.
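For readers unfamiliar with the target-decoy strategy, the following minimal sketch shows the conventional estimate: at a given score threshold, the number of decoy matches is taken as a proxy for the number of incorrect target matches. It assumes the target and decoy databases are the same size and are searched separately. The function name and the example scores are hypothetical illustrations and are not taken from the thesis.

```python
def target_decoy_fdr(target_scores, decoy_scores, threshold):
    """Estimate FDR at a threshold as (#decoy matches) / (#target matches).

    Assumption: decoy matches above the threshold occur about as often as
    incorrect target matches above the same threshold.
    """
    n_target = sum(1 for s in target_scores if s >= threshold)
    n_decoy = sum(1 for s in decoy_scores if s >= threshold)
    if n_target == 0:
        return 0.0  # nothing accepted, so nothing falsely discovered
    return min(1.0, n_decoy / n_target)


# Made-up scores, used only to exercise the function.
target_scores = [9.1, 8.7, 7.9, 6.5, 5.2, 4.8, 3.3, 2.9]
decoy_scores = [5.0, 4.1, 3.6, 2.2, 1.9, 1.5, 1.1, 0.8]
fdr = target_decoy_fdr(target_scores, decoy_scores, threshold=4.0)
print(f"Estimated FDR at threshold 4.0: {fdr:.3f}")
```

Raising the threshold generally lowers the estimated FDR at the cost of accepting fewer matches.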

Summary for Lay Audience

In recent years, a growing number of glycopeptide identification software tools have been developed that can score glycopeptides and identify them from tandem mass spectrometry data. However, because the results may contain mistakes, false discovery rate (FDR) estimation plays a key role in assessing how much confidence the identifications deserve. The decoy-target approach, which requires building a decoy database, is one of the most effective methods for calculating FDR. In this study, we explored a novel method for generating decoy databases based on the probability of glycan composition in the target database and compared it with other decoy construction methods. In addition, since the distribution of target matches can be a mixture of correct and incorrect matches, we created a new FDR estimation approach that uses the EM algorithm with a mixture model.
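The mixture idea can be illustrated with a small sketch. The summary does not state which distributions the thesis uses, so this example assumes a two-component Gaussian mixture (one component for incorrect matches, one for correct matches) fitted by EM, and estimates the FDR as the average posterior probability of being incorrect among accepted matches. All function names and the synthetic scores are hypothetical and should not be read as the author's implementation.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_component_em(scores, n_iter=200):
    """Fit a two-Gaussian mixture by EM (Gaussian shape is an assumption).

    Returns (weight, mu, sigma); index 0 is the lower-scoring component,
    interpreted here as the incorrect matches.
    """
    lo, hi = min(scores), max(scores)
    spread = (hi - lo) or 1.0
    mu = [lo + 0.25 * spread, lo + 0.75 * spread]
    sigma = [spread / 4.0, spread / 4.0]
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each score.
        resp = []
        for x in scores:
            p = [weight[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            total = (p[0] + p[1]) or 1e-300
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weights, means, and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-300
            mu[k] = sum(r[k] * x for r, x in zip(resp, scores)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, scores)) / nk
            sigma[k] = max(math.sqrt(var), 1e-3)  # avoid a collapsed component
            weight[k] = nk / len(scores)
    if mu[0] > mu[1]:  # keep component 0 as the lower-scoring one
        mu.reverse()
        sigma.reverse()
        weight.reverse()
    return weight, mu, sigma


def mixture_fdr(scores, threshold, weight, mu, sigma):
    """Average posterior probability of 'incorrect' among accepted scores."""
    accepted = [x for x in scores if x >= threshold]
    if not accepted:
        return 0.0
    total = 0.0
    for x in accepted:
        p0 = weight[0] * normal_pdf(x, mu[0], sigma[0])
        p1 = weight[1] * normal_pdf(x, mu[1], sigma[1])
        total += p0 / ((p0 + p1) or 1e-300)
    return total / len(accepted)


# Synthetic scores for illustration only: 200 "incorrect" and 200 "correct".
rng = random.Random(0)
scores = [rng.gauss(2.0, 1.0) for _ in range(200)]
scores += [rng.gauss(8.0, 1.0) for _ in range(200)]
weight, mu, sigma = fit_two_component_em(scores)
fdr = mixture_fdr(scores, 5.0, weight, mu, sigma)
print(f"weights={weight}, means={mu}, estimated FDR at 5.0: {fdr:.4f}")
```

Unlike the counting estimate from a decoy search, this approach works from the target score distribution alone, which is why the choice of component distributions matters.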

Recommended Citation

Li, Xiaoou, "Decoy-Target Database Strategy and False Discovery Rate Analysis for Glycan Identification" (2023). Electronic Thesis and Dissertation Repository. 9581.
https://ir.lib.uwo.ca/etd/9581

Included in

Bioinformatics Commons, Theory and Algorithms Commons
