A gloss composition and context clustering based distributed word sense representation model


In recent years, there has been increasing interest in learning distributed representations of word senses. Traditional context-clustering-based models usually require careful tuning of model parameters and typically perform worse on infrequent word senses. This paper presents a novel approach that addresses these limitations by first initializing the word sense embeddings through learning sentence-level embeddings from WordNet glosses using a convolutional neural network. The initialized word sense embeddings are then used by a context-clustering-based model to generate the distributed representations of word senses. Our learned representations outperform the publicly available embeddings on half of the metrics in the word similarity task and on 6 out of 13 subtasks in the analogical reasoning task, and give the best overall accuracy in the word sense effect classification task, which shows the effectiveness of our proposed distributed representation learning model.

Publication DOI: https://doi.org/10.3390/e17096007
Divisions: College of Engineering & Physical Sciences > Systems analytics research institute (SARI)
Additional Information: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Uncontrolled Keywords: distributed representation, lexical semantic compositionality, natural language processing, word sense disambiguation, Physics and Astronomy (all)
Publication ISSN: 1099-4300
Last Modified: 10 Jun 2024 07:15
Date Deposited: 10 Nov 2015 09:35
Full Text Link: http://www.mdpi ... -4300/17/9/6007
Related URLs: http://www.scop ... tnerID=8YFLogxK (Scopus URL)
PURE Output Type: Article
Published Date: 2015-08-27
Authors: Chen, Tao; Xu, Ruifeng; He, Yulan (ORCID Profile 0000-0003-3948-5845); Wang, Xuan



Version: Published Version
