Electronic Thesis and Dissertation Repository

Thesis Format



Master of Science


Computer Science

Collaborative Specialization

Artificial Intelligence


Wang, Boyu


In domain adaptation, a model trained on one dataset (the source domain) is applied to a different but related dataset (the target domain). A particularly challenging setting is unsupervised source-free domain adaptation (SFDA), in which source data, source labels, and target labels are all unavailable during adaptation. This thesis explores a realistic scenario in which the target dataset contains images that are irrelevant to the adaptation task, as can happen through errors in data collection or processing. We present experiments and analysis showing that current state-of-the-art (SOTA) SFDA methods suffer significant performance drops under this scenario. To address this problem, we propose a new domain adaptation framework that integrates anomaly detection into the adaptation process, removing samples irrelevant to adaptation. Our framework substantially outperforms current SOTA methods, making it highly applicable to real-world uses of SFDA. We further argue that future SFDA research should be benchmarked against our proposed scenario as a better measure of real-world performance.

Summary for Lay Audience

Domain adaptation is the task of adapting a model trained on one domain (a dataset and its associated task) to perform well on a different but related domain. This is useful when there are no labeled data in the target domain, or when the target domain differs significantly from the source domain. In this thesis, we propose and study a realistic domain adaptation scenario in which irrelevant samples are present while adapting to the new domain. Through experimental results and analysis, we show that these samples significantly harm domain adaptation performance. Since many datasets used in industry are not completely free of irrelevant or inaccurate samples, closing this performance gap is of great importance. We propose a new domain adaptation framework that combats this problem by incorporating anomaly detection, the task of identifying samples that differ from those in the training set. Our framework outperforms multiple state-of-the-art domain adaptation methods, and should therefore help ensure that real-world implementations of domain adaptation are successful. In addition, future studies should adopt our proposed scenario as a new benchmark for domain adaptation.