Electronic Thesis and Dissertation Repository

Investigating Citation Linkage as a Sentence Similarity Measurement Task using Deep Learning

Sudipta Singha Roy, The University of Western Ontario

Abstract

Research publications reflect advancements in their research domains. In these publications, scientists often use citations to bolster the presented findings and to portray the improvements those findings bring, while also making the content more understandable by guiding the reader through the flow of information. In the science domain, a citation refers to the document from which the information originates but does not specify the text span that is actually being cited. A more precise reference would indicate the text being referenced. This thesis develops a framework that links citing sentences in a research article to the related cited sentences in the corresponding referenced documents. The citation linkage problem is modeled as a semantic relatedness task: given a citing sentence, the framework pairs it with each sentence from the reference document and then tries to determine which sentence pairs are semantically similar and which are not. Constructing the citation linkage framework involves creating a corpus and using deep-learning models to measure semantic similarity.
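
The pairing step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the deep-learning similarity model is replaced here by a plain bag-of-words cosine similarity so that the example stays self-contained, and the names `tokenize`, `cosine_similarity`, `link_citation`, and the `threshold` value are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_citation(citing_sentence, reference_sentences, threshold=0.3):
    """Pair the citing sentence with every reference sentence and score each pair.

    Returns (index, score, is_linked) tuples sorted by descending score.
    The threshold is an assumed cut-off, not a value taken from the thesis,
    and the bag-of-words score stands in for a learned similarity model.
    """
    citing_vec = Counter(tokenize(citing_sentence))
    scored = []
    for idx, ref in enumerate(reference_sentences):
        score = cosine_similarity(citing_vec, Counter(tokenize(ref)))
        scored.append((idx, score, score >= threshold))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Calling `link_citation(citing, sentences_of_cited_paper)` returns the candidate sentences ranked by score, each flagged as linked or not. In a setting closer to the one the abstract describes, the scoring function would presumably be swapped for a trained sentence-pair model, while the pairing and ranking loop would stay the same.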