Word Senses as Clusters of Meaning Modulations: A Computational Model of Polysemy
Most words in natural languages are polysemous; that is, they have related but different meanings in different contexts. This one-to-many mapping of form to meaning presents a challenge to understanding how word meanings are learned, represented, and processed. Previous work has focused on solutions in which multiple static semantic representations are linked to a single word form. Such solutions fail to capture important generalizations about how polysemous words are used, in particular the graded nature of polysemous senses and the flexibility and regularity of polysemy. We provide a novel view of how polysemous words are represented and processed, focusing on how meaning is modulated by context. Our theory is implemented within a recurrent neural network that learns distributional information through exposure to a large and representative corpus of English. Clusters of meaning emerge from how the model processes individual word forms. In keeping with distributional theories of semantics, we suggest that word meanings are generalized from the contexts of different word tokens, with polysemy emerging as multiple clusters of contextually modulated meanings. We validate our results against a human-annotated corpus of polysemy, focusing on the gradedness, flexibility, and regularity of polysemous sense individuation, as well as against behavioral findings on offline sense relatedness ratings and online sentence processing. The results provide novel insights, from both a theoretical and a computational point of view, into how polysemy emerges from contextual processing of word meaning.
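The idea of senses as clusters of contextually modulated meanings can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the model itself: the token vectors below are hand-made toy values rather than hidden states of a recurrent network, the example word and contexts are hypothetical, and the greedy threshold clustering over cosine similarity is a stand-in chosen for brevity, not the clustering procedure used in this work.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cluster_tokens(vectors, threshold=0.8):
    """Greedy clustering (an illustrative assumption, not the paper's method).

    Each token vector joins the existing cluster whose centroid is most
    similar to it, provided that similarity reaches `threshold`;
    otherwise the token starts a new cluster.
    Returns a list of clusters, each a list of token indices.
    """
    clusters = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for members in clusters:
            centroid = mean_vector([vectors[j] for j in members])
            sim = cosine(vec, centroid)
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


# Hypothetical contextualized vectors for five tokens of the word "paper".
# The values are invented for illustration only.
TOKEN_VECTORS = [
    [0.9, 0.1, 0.0],  # "tore the paper"       (material reading)
    [0.8, 0.2, 0.1],  # "a sheet of paper"     (material reading)
    [0.1, 0.9, 0.2],  # "published a paper"    (article reading)
    [0.2, 0.8, 0.1],  # "cited the paper"      (article reading)
    [0.5, 0.6, 0.1],  # "read the paper"       (intermediate token)
]

if __name__ == "__main__":
    for members in cluster_tokens(TOKEN_VECTORS, threshold=0.8):
        print(members)
```

With these toy values, the tokens fall into two clusters, and the intermediate token is fairly similar to both centroids. That intermediate similarity is the kind of gradedness a fixed list of static senses does not express; the threshold of 0.8 is arbitrary, and a different value would merge or split the clusters.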