Unsupervised event exploration from social text streams

Zhou, Deyu, Chen, Liangyu, Zhang, Xuan and He, Yulan (2017). Unsupervised event exploration from social text streams. Intelligent Data Analysis, 21 (4), pp. 849-866.

Abstract

Social media provides unprecedented opportunities for people to disseminate information and share their opinions and views online. Extracting events from social media platforms such as Twitter could help in understanding what is being discussed. However, event extraction from social text streams poses huge challenges due to the noisy nature of social media posts and the dynamic evolution of language. We propose a generic unsupervised framework for exploring events on Twitter which consists of four major steps: filtering, pre-processing, extraction and categorization, and post-processing. Tweets published in a certain time period are aggregated, and noisy tweets that do not contain newsworthy events are removed in the filtering step. The remaining tweets are pre-processed with temporal resolution, part-of-speech tagging and named entity recognition in order to identify the key elements of events. An unsupervised Bayesian model is proposed to automatically extract structured representations of events in the form of quadruples <entity, keyword, date, location> and to further categorize the extracted events into event types. Finally, the categorized events are assigned event type labels without human intervention. The proposed framework has been evaluated on over 60 million tweets collected over one month in December 2010. A precision of 78.01% is achieved for event extraction using our proposed Bayesian model, outperforming a competitive baseline by nearly 13.6%. Moreover, events are also clustered into coherent groups with automatically assigned event type labels, with an accuracy of 42.57%.
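The quadruple representation and the first steps of the pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Event`, `AnnotatedTweet`, `filter_tweets` and `extract_events` are hypothetical, the filter rule is an assumed stand-in, the extraction step uses a simple frequency heuristic in place of the unsupervised Bayesian model, and the toy tweets are invented. Categorization and event type labeling are omitted.

```python
from collections import Counter, defaultdict
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """Structured event in the form <entity, keyword, date, location>."""
    entity: str
    keyword: str
    date: str
    location: Optional[str]


class AnnotatedTweet(NamedTuple):
    """Hypothetical result of pre-processing; annotations are hand-written here."""
    text: str
    entities: list
    keywords: list
    date: str
    locations: list


def filter_tweets(tweets):
    """Filtering stand-in (assumption): drop tweets with no named entity."""
    return [t for t in tweets if t.entities]


def extract_events(tweets):
    """Extraction stand-in: group tweets by (entity, date) and keep the most
    frequent keyword and location. A heuristic, NOT the paper's Bayesian model."""
    groups = defaultdict(list)
    for tweet in tweets:
        for entity in tweet.entities:
            groups[(entity, tweet.date)].append(tweet)

    events = []
    for (entity, date), members in sorted(groups.items()):
        keyword_counts = Counter(k for t in members for k in t.keywords)
        location_counts = Counter(l for t in members for l in t.locations)
        if not keyword_counts:
            continue  # no keyword means no usable quadruple
        keyword = keyword_counts.most_common(1)[0][0]
        location = location_counts.most_common(1)[0][0] if location_counts else None
        events.append(Event(entity, keyword, date, location))
    return events


# Invented toy data, for illustration only.
tweets = [
    AnnotatedTweet("Acme launches phone in Paris", ["Acme"], ["launches"],
                   "2010-12-01", ["Paris"]),
    AnnotatedTweet("Acme launches today!", ["Acme"], ["launches"],
                   "2010-12-01", []),
    AnnotatedTweet("so bored lol", [], [], "2010-12-01", []),
]

events = extract_events(filter_tweets(tweets))
print(events)
```

In this sketch the two tweets about the same entity on the same date collapse into one quadruple, while the tweet with no entity is discarded as noise. The paper's model infers these groupings jointly and probabilistically rather than by exact matching.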

Publication DOI: https://doi.org/10.3233/IDA-160048
Divisions: Engineering & Applied Sciences > Computer science
Engineering & Applied Sciences > Systems analytics research institute (SARI)
Engineering & Applied Sciences > Computer science research group
Additional Information: Copyright: 2017 – IOS Press and the authors. The final publication is available at IOS Press through http://dx.doi.org/10.3233/IDA-160048 Funding: This work was funded by the National Natural Science Foundation of China (61528302), the Natural Science Foundation of Jiangsu Province of China (BK20161430), the Innovate UK under the grant number 101779 and the Collaborative Innovation Center of Wireless Communications Technology.
Uncontrolled Keywords: Bayesian model, event extraction, social media, unsupervised learning, Theoretical Computer Science, Computer Vision and Pattern Recognition, Artificial Intelligence
Related URLs: http://www.scop ... tnerID=8YFLogxK (Scopus URL)
Published Date: 2017-08-19
Authors: Zhou, Deyu
Chen, Liangyu
Zhang, Xuan
He, Yulan (0000-0003-3948-5845)

Version: Accepted Version
