Electronic Thesis and Dissertation Repository

Thesis Format

Integrated Article


Master of Science


Computer Science


Ling, Charles X.


The growing number of publications makes it increasingly difficult for researchers to find appropriate papers to cite quickly and accurately. Context-aware citation recommendation, which provides suggestions based mainly on local citation contexts, has been shown to help alleviate this problem. However, previous work mainly uses RNN models and their variants, which tend to be highly complicated and computationally heavy. In this paper, we propose a lightweight and explainable model that is quick to train and achieves high performance. Our model is based on a pre-trained sentence embedding model and is trained with triplet loss. Quantitative results on the benchmark dataset show that our model achieves strong performance with or without metadata. Qualitative evidence shows that our model pays different levels of attention to the relevant parts of citation contexts and metadata, suggesting that our method is explainable and more trustworthy.
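As a rough illustration of the triplet loss mentioned above, the sketch below scores a citation context (anchor) against a cited paper (positive) and an uncited paper (negative). It is not the thesis's implementation: the Euclidean distance, the margin of 1.0, the function names, and the two-dimensional toy vectors are all assumptions chosen for readability, and real inputs would be embeddings produced by a pre-trained sentence encoder.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The margin value and the distance function are illustrative
    assumptions, not settings taken from the thesis.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy embeddings (hypothetical values standing in for encoder outputs).
context = [1.0, 0.0]   # citation context (anchor)
cited = [0.9, 0.1]     # paper that was actually cited (positive)
uncited = [0.0, 1.0]   # paper that was not cited (negative)

# The cited paper is already much closer than the uncited one, so no penalty.
loss_good = triplet_loss(context, cited, uncited)

# Swapping the roles puts the "positive" far away, so the loss is large.
loss_bad = triplet_loss(context, uncited, cited)

print(loss_good, loss_bad)
```

Minimizing a loss of this shape pulls each citation context toward the paper it cites and pushes it away from papers it does not cite, which is what allows candidates to be ranked by distance at recommendation time.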

Summary for Lay Audience

In recent years, natural language processing has seen tremendous breakthroughs in research problems such as machine translation and word processing. These improvements have dramatically changed people's lifestyles and made our lives much easier. However, many areas still need to be explored. Citation recommendation, and local citation recommendation in particular, is one of them. Given the growing number of publications in recent years, it is becoming harder and harder for researchers to find appropriate papers to cite. Local citation recommendation aims to solve this problem by imitating the way humans think. Given a citation context of several sentences, a local citation recommendation system suggests papers that could be cited. The user can then choose from these papers, which dramatically reduces the user's workload. In this thesis, we focus on the local citation recommendation problem. We propose an innovative method based on a pre-trained sentence encoder. Our method outperforms the baselines on all metrics.