Master of Science
Epidemiology and Biostatistics
The area under the receiver operating characteristic curve (AUC) is commonly used to quantify the discriminative ability of tests that yield ordinal or continuous data. When planning a study to evaluate a new test, it is important to determine the minimum sample size required to estimate the AUC with a prespecified precision. However, conventional sample size formulas do not consider the probability of achieving that precision, which leads to underestimated sample sizes. To incorporate this assurance probability, asymptotic sample size formulas were derived in this thesis using different variance estimators for the AUC. The precision of the AUC estimate was quantified by either the lower confidence limit or the confidence interval width. The performance of the proposed sample size formulas was evaluated through simulation studies. Simulation results show that the formula based on lower limits with the nonparametric method performs best and can be used with both ordinal and continuous data. The methods are illustrated with examples from previously published data.
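For readers unfamiliar with the quantity being estimated, the following is a minimal sketch of the standard nonparametric (Mann-Whitney) AUC estimate, which applies to both ordinal and continuous scores. It is illustrative only and is not one of the sample size formulas derived in the thesis. The function name `auc_mann_whitney` and the example scores are hypothetical, and the sketch assumes that higher scores indicate disease.

```python
import numpy as np


def auc_mann_whitney(diseased, healthy):
    """Nonparametric AUC: estimated P(X_diseased > X_healthy) + 0.5 * P(tie).

    Assumes higher scores indicate disease (an assumption of this sketch).
    """
    d = np.asarray(diseased, dtype=float)
    h = np.asarray(healthy, dtype=float)
    # Compare every diseased score with every healthy score.
    diff = d[:, None] - h[None, :]
    # A concordant pair counts 1, a tied pair counts 0.5, a discordant pair 0.
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


# Hypothetical ordinal ratings with one tied pair (3 vs 3).
example_auc = auc_mann_whitney([3, 4, 5], [1, 2, 3])
```

Counting tied pairs as one half is what lets the same estimator handle ordinal ratings, where ties are common, as well as continuous measurements.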
Summary for Lay Audience
The area under the receiver operating characteristic curve (AUC) is a tool used for describing the discriminative ability of diagnostic tests. Discriminative ability must be evaluated before a test is adopted and used in practice. An important factor to consider when planning an evaluation study is the minimum required sample size: too small a sample makes it difficult to see the desired results, while too large a sample may waste resources. Typically, sample sizes are calculated using sample size formulas. However, existing formulas tend to underestimate the required sample size because they do not consider the assurance probability of achieving a prespecified level of precision. In this thesis, we derived sample size formulas that incorporate this prespecified assurance probability. As sample size formulas require the variance of the AUC, we chose three different variance formulas to use. Simulation studies were conducted to evaluate the performance of the sample size formulas. The results show that the formula based on lower limits with the nonparametric method performed best and can be used with both ordinal and continuous data.
Lu, Grace, "Sample Size Formulas For Estimating Areas Under the Receiver Operating Characteristic Curves With Precision and Assurance" (2021). Electronic Thesis and Dissertation Repository. 8045.