
A Novel Method for Assessment of Batch Effect on Single-Cell RNA Sequencing Data
Abstract
In recent years, technological developments have enabled the comprehensive transcriptional profiling of thousands of single cells in a single experiment. However, much remains to be gained from integrating datasets from different donors, studies, and technological platforms. A major challenge in this regard is the technical variability introduced by handling different batches, known as batch effects, which can obscure biological variation. Various studies have proposed methods for assessing batch effects within a dataset, with the aim of establishing reliable criteria for selecting a batch effect removal method. However, these assessment methods do not always perform reliably. This study provides a comprehensive review of both batch effect removal and assessment methods and introduces a novel method for assessing batch effect removal. The performance of the proposed method was evaluated by comparing it with four other batch effect assessment methods on eleven test datasets. The proposed method consistently outperformed the others, passing all challenges, whereas each of the other methods failed at least one test. The proposed method was then applied to three integrated biological datasets to evaluate its performance on real-world data. In this evaluation, it showed the highest correlation with the expert’s assessment of the datasets, indicating that it accurately identified batch effects in the data.