Doctor of Philosophy
Mercer, Robert E.
Natural language processing (NLP) is one of the most important technologies of the information age. Understanding complex language utterances is also a crucial part of artificial intelligence. Many natural language applications are powered by machine learning models performing a large variety of underlying tasks. Recently, deep learning approaches have achieved very high performance across many NLP tasks. To reach this level of performance, it is crucial for computers to have an appropriate representation of sentences. The tasks addressed in this thesis are best approached using shallow semantic representations: vectors embedded in a semantic space. We present a variety of novel deep learning approaches for generating effective sentence representations in this space. These semantic representations can be either general or task-specific. We focus on learning task-specific sentence representations, where the tasks often overlap substantially. We design a set of general-purpose and task-specific sentence encoders combining word-level semantic knowledge with word- and sentence-level syntactic information. For the former, we perform an intelligent amalgamation of word vectors using modern deep learning modules. For the latter, we use word-level knowledge, such as part-of-speech, spelling, and suffix features, together with sentence-level information drawn from natural language parse trees, which provide the hierarchical structure of a sentence along with the grammatical relations between its words. Further expertise is added with reinforcement learning, which guides a machine learning model through a reward-penalty game. Rather than striving only for good performance, we design models that are transparent and explainable, providing an intuitive account of each model's design and of how it makes its decisions.
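The simplest form of "amalgamating word vectors" into a sentence representation can be sketched as mean pooling. This is only a minimal illustration of the general idea, not the thesis's actual encoders; the toy vocabulary and 4-dimensional vectors below are invented for the example (real systems use pretrained embeddings with hundreds of dimensions and learned pooling modules).

```python
import numpy as np

# Hypothetical 4-dimensional word vectors, invented for illustration.
word_vectors = {
    "the":    np.array([0.1, 0.3, 0.0, 0.2]),
    "cat":    np.array([0.7, 0.1, 0.5, 0.0]),
    "sleeps": np.array([0.2, 0.6, 0.1, 0.4]),
}

def encode_sentence(tokens):
    """Mean-pool word vectors into a single fixed-size sentence vector."""
    return np.mean([word_vectors[t] for t in tokens], axis=0)

sentence_vec = encode_sentence(["the", "cat", "sleeps"])
print(sentence_vec.shape)  # (4,) — same dimensionality regardless of sentence length
```

A key property of such pooling is that sentences of any length map to vectors of the same size, which is what lets downstream models compare them directly.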
Our extensive experiments show that these models achieve competitive performance compared with currently available state-of-the-art generalized and task-specific sentence encoders. All but one of the tasks deal with English-language texts. The multilingual semantic similarity task required creating a multilingual corpus, for which we provide a novel semi-supervised approach that generates artificial negative samples when only positive samples are available.
Summary for Lay Audience
The goal of Computational Linguistics is to analyze and process human language automatically by computer. To help achieve this goal, Natural Language Processing (NLP) models, actualized by Artificial Intelligence (AI) algorithms, are being incorporated into increasingly intelligent computer applications at a rapid pace. These NLP models are used in the language-related aspects of the publishing, healthcare, banking, advertising, and insurance industries to improve their customer services and enterprise activities. Certain NLP tasks are fundamental to this thesis: paraphrase identification, sentence similarity, question answering, sentiment analysis, and sentence compression. Deep learning, an AI technique that is being applied more and more, improves the functionality and robustness of the solutions for these tasks. Sentence similarity analysis and paraphrase identification are often used to check the originality of a document and prevent plagiarism, and they also aid natural language understanding. Question answering improves customer service and can enhance administrative activities by allowing end-users to ask about different services and products, and get responses, in their preferred language. The growing ability of deep learning-based models to handle this linguistic diversity is beginning to make it possible for people from various corners of the world to communicate in their own languages. Automated answering models streamline enterprise administration by reducing customer service costs and saving office time. Sentiment analysis quantifies subjective information and extracts affective states; it is widely used to analyze customer feedback such as survey responses, movie reviews, and healthcare materials. Sentence compression tries to create a representative summary or abstract of a piece of text by finding its most informative concepts.
Research in the field of AI is currently attempting to achieve human-level performance on the aforementioned tasks. To achieve this performance level, it is crucial for computers to have an appropriate representation of sentences. The term representation in this case means a set of numbers (i.e., a vector). Sentences with similar meanings should be represented by similar sets of numbers. This thesis develops methods to find good representations of sentences using two modern AI techniques: deep learning and reinforcement learning.
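The idea that "similar meanings give similar sets of numbers" is usually measured with cosine similarity between the vectors. Below is a minimal sketch of that measurement; the three sentence vectors are invented purely for illustration and do not come from any model in the thesis.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; values near 1.0 mean
    the vectors (and, ideally, the sentence meanings) are very similar."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical sentence vectors: two paraphrases and one unrelated sentence.
dog_runs    = np.array([0.90, 0.80, 0.10])  # "The dog runs in the park."
dog_jogs    = np.array([0.85, 0.75, 0.15])  # "The dog jogs through the park."
stock_falls = np.array([0.10, 0.20, 0.90])  # "The stock market fell today."

print(cosine_similarity(dog_runs, dog_jogs))     # close to 1.0
print(cosine_similarity(dog_runs, stock_falls))  # much lower
```

Tasks such as paraphrase identification and sentence similarity reduce, at their core, to producing vectors for which this kind of comparison agrees with human judgments.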
Ahmed, Mahtab, "Generating Effective Sentence Representations: Deep Learning and Reinforcement Learning Approaches" (2021). Electronic Thesis and Dissertation Repository. 7780.