Electronic Thesis and Dissertation Repository

Thesis Format



Master of Science


Computer Science


Mercer, Robert E.


The way in which research articles are cited reflects how previous work is used by other researchers or stakeholders and can indicate the impact of that work on subsequent experiments. Based on human intuition, a citation can be perceived as positive, negative, or neutral. While current citation indexing systems provide the author and publication name of the cited article, as well as its citation count, they do not indicate the polarity of the citation. This study aims to identify the polarity of citations in scientific research articles using deep-learning methods together with pre-trained language models such as BERT, ELECTRA, RoBERTa, Bio-RoBERTa, SPECTER, ERNIE, Longformer, and BigBird. Most citations have a neutral polarity, which results in imbalanced datasets for training deep-learning models. To address this issue, a class balancing technique is proposed and applied to all datasets to improve consistency and results. Pre-trained language models are used to generate optimal features, and ensemble techniques are used to combine all model predictions to produce the highest precision, recall, and F1-scores for all three labels.
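The idea of combining model predictions can be illustrated with a minimal sketch. It assumes simple majority voting over per-model labels, which is only one possible ensemble technique and is not necessarily the one used in the thesis. The model names, the example predictions, and the `majority_vote` helper are hypothetical and serve only as illustration.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label predictions by majority vote.

    predictions_per_model: a list with one inner list per model; each inner
    list holds one polarity label per citation, in the same order.
    Ties go to the tied label predicted by the earliest model in the list.
    """
    combined = []
    # zip(*...) groups the votes that all models cast for the same citation.
    for votes in zip(*predictions_per_model):
        counts = Counter(votes)
        top = max(counts.values())
        # Pick the first vote (in model order) that reaches the top count.
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


# Hypothetical predictions from three models for four citations.
model_outputs = {
    "model_a": ["neutral", "positive", "negative", "neutral"],
    "model_b": ["neutral", "neutral", "negative", "positive"],
    "model_c": ["positive", "positive", "neutral", "neutral"],
}

ensemble = majority_vote(list(model_outputs.values()))
print(ensemble)  # ['neutral', 'positive', 'negative', 'neutral']
```

Majority voting needs only the predicted labels; an ensemble that averages class probabilities would instead require each model's probability output.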

Summary for Lay Audience

When writing a research article, authors frequently cite prominent earlier works that motivated the current work or that performed well on the same problem the paper is trying to solve. The intention behind a citation can be positive, negative, or neutral. Readers sometimes need to read the referenced work to grasp the ideas presented in the current paper, so knowing the intention behind a citation before going through the referenced articles would be very helpful to them. Current citation indexing systems provide a lot of information about the referenced article, such as the names of the authors, the publication, and the paper title. However, they do not make it possible to retrieve the intention behind the citation. For this reason, in this work I tried to develop a system that captures the polarity of the citations used in research papers.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.