Making sense of microposts (#MSM2013) concept extraction challenge

Abstract

Microposts are small fragments of social media content published using a lightweight paradigm (e.g., Tweets, Facebook likes, foursquare check-ins). They have been used in a variety of applications (e.g., sentiment analysis, opinion mining, trend analysis) that glean useful information from them, often with third-party concept extraction tools. There has been a very large uptake of such tools in the last few years, along with the creation and adoption of new methods for concept extraction. However, the evaluation of such efforts has been largely confined to document corpora (e.g., news articles), which calls into question the suitability of concept extraction tools and methods for Micropost data. This report describes the Making Sense of Microposts Workshop (#MSM2013) Concept Extraction Challenge, hosted in conjunction with the 2013 World Wide Web conference (WWW'13). The Challenge dataset comprised a manually annotated training corpus of Microposts and an unlabelled test corpus. Participants were set the task of engineering a concept extraction system for a defined set of concepts. Out of a total of 22 complete submissions, 13 were accepted for presentation at the workshop; the submissions covered methods ranging from sequence mining algorithms for attribute extraction, to part-of-speech tagging for Micropost cleaning, to rule-based and discriminative models for token classification. In this report we describe the evaluation process and explain the performance of different approaches in different contexts.

Additional Information: Cano Basave, AE, Varga, A, Rowe, M, Stankovic, M & Dadzie, A-S: Making sense of microposts (#MSM2013) concept extraction challenge. Proc. of the workshop on 'Making Sense of Microposts' co-located with the 22nd international World Wide Web conference (WWW'13), Rio de Janeiro, Brazil, 13 May, ceur-ws.org/Vol-1019/msm2013-challenge-report.pdf
Event Title: Making sense of microposts
Event Type: Other
Event Dates: 2013-05-13
Uncontrolled Keywords: General Computer Science
Last Modified: 26 Aug 2024 10:24
Date Deposited: 03 Nov 2015 11:55
Full Text Link: http://ceur-ws. ... enge-report.pdf
Related URLs: http://www.scop ... tnerID=8YFLogxK (Scopus URL)
PURE Output Type: Conference contribution
Published Date: 2013
Authors: Cano Basave, Amparo Elizabeth
Varga, Andrea
Rowe, Matthew
Stankovic, Milan
Dadzie, Aba-Sah

Version: Accepted Version

