Topical Phrase Extraction from Clinical Reports by Incorporating both Local and Global Context

Abstract

Making sense of words often requires simultaneously examining the surrounding context of a term and the global themes that characterize the overall corpus. Several topic models have already exploited word embeddings to recognize local context; however, this local information has been only weakly combined with the global context during topic inference. This paper proposes extracting topical phrases by corroborating the word embedding information with the global context detected by Latent Semantic Analysis, and then combining the two by means of the Pólya urn model. To highlight the effectiveness of this combined approach, the model was assessed on clinical reports, a challenging scenario characterized by technical jargon and limited available word statistics. Results show that it outperforms state-of-the-art approaches in terms of both topic coherence and computational cost.

Divisions: College of Engineering & Physical Sciences > Computer Science
College of Engineering & Physical Sciences > Systems analytics research institute (SARI)
College of Engineering & Physical Sciences > Mathematics
Additional Information: Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Event Title: 2018 Workshop on Health Intelligence (W3PHIAI 2018)
Event Type: Other
Event Dates: 2018-02-02 - 2018-02-03
Related URLs: https://aaai.or ... aper/view/16612 (Publisher URL)
PURE Output Type: Conference contribution
Published Date: 2018-06-20
Published Online Date: 2018-06-20
Accepted Date: 2017-11-20
Authors: Pergola, Gabriele
He, Yulan (ORCID Profile 0000-0003-3948-5845)
Lowe, David

Version: Accepted Version
