Electronic Thesis and Dissertation Repository

Degree

Master of Science

Program

Computer Science

Supervisor

Robert Mercer

Abstract

Synonymy is an important part of all natural language, but not all synonyms are created equal. That two words are synonymous does not usually mean they can always be interchanged. The problem that we attempt to address is that of near-synonymy: choosing the right word based purely on its surrounding words. This new computational method, unlike previous methods used on this problem, is capable of making multiple word suggestions, which models human choice more accurately. It covers a large number of words, does not require training, and can be run in real time. On previous testing data, when able to make multiple suggestions, it improved on the previous best method by over 17 percentage points, and on the human annotators' near-synonym choices by 4.5 percentage points on average, with a maximum of 14 percentage points. In addition, this thesis presents new synonym sets and human-annotated test data that more accurately fit this problem.

