Electronic Thesis and Dissertation Repository

Thesis Format

Integrated Article

Degree

Doctor of Philosophy

Program

Computer Science

Supervisor

Mercer, Robert E.

Abstract

In an era of information explosion, the ability to automatically extract knowledge and gain insights from diverse linguistic genres has become imperative. Comprehending intricate linguistic expressions is an indispensable facet of artificial intelligence. Deep learning techniques have emerged as powerful tools for classification, relation extraction, semantic similarity measurement, and document summarization, offering the promise of revolutionizing our understanding of these crucial domains. In the dynamic landscape of Natural Language Processing (NLP), the integration of syntactic and semantic elements stands as a pivotal frontier. This investigation incorporates both syntactic and semantic dimensions within NLP applications. By leveraging tree- and graph-based neural networks, this study pioneers a holistic approach that augments language understanding and processing capabilities. Through the fusion of structural and semantic-driven insights, this work explores various NLP applications for two linguistic genres: scientific text and psycho-linguistic text. Scientific articles inherently embody a sophisticated framework of information representation and require a depth of background knowledge for comprehension. This background knowledge is gleaned by examining the citations within the paper at hand. The objective is to study the citation linkage task as an avenue for extracting the background information needed to analyze scientific documents. Furthermore, for summarization, the citation network is leveraged to improve the performance of summarization models by furnishing additional context. Different tree-structured neural networks are systematically explored to discern relations between biomedical entities within scientific articles, thus contributing to the efficacy of relation extraction.
In a contemporary landscape dominated by social media, natural language processing has emerged as a potent instrument for psychologists analyzing individuals' personality traits. Conventional models cannot process text sequences that exceed their token intake limit. This work proposes solutions based on tree-structured neural networks and graph attention networks that identify personality traits from long written compositions.

Summary for Lay Audience

In today's age of information overload, it's crucial to automatically extract insights from diverse types of textual representations. This is especially important for artificial intelligence to comprehend complex linguistic expressions. Deep learning techniques, powerful tools for tasks like classification, relation extraction, and document summarization, hold the potential to revolutionize our understanding of these crucial domains.

Within the dynamic field of Natural Language Processing (NLP), integrating both syntactic and semantic elements is a key frontier. This research explores the combination of these dimensions using tree- and graph-based neural networks, offering a holistic approach to enhance language understanding and processing capabilities. The study focuses on two linguistic genres: scientific text and psycho-linguistic texts.

Scientific articles are intricate in their representation of information, requiring a depth of background knowledge for comprehension. This research meticulously examines citations within scientific papers to extract essential background information. The primary goal is to scrutinize citation linkages, providing necessary context for the detailed analysis of scientific documents. Additionally, the citation network is utilized to improve summarization models by adding contextual underpinnings along with a reflection of the research community's view. Various tree-structured neural networks are systematically explored to discern relations between biomedical entities within scientific articles, enhancing the efficacy of relation extraction tasks.

In today's world dominated by social media, natural language processing becomes a powerful tool for psychologists studying individuals' personality traits. Conventional models face limitations in handling lengthy textual sequences. This work introduces innovative solutions using tree-structured neural networks and graph attention networks to identify personality traits from extended written compositions. These approaches aim to overcome the challenges posed by the token intake limit of traditional models, providing new avenues for understanding and analyzing complex human expressions.
