Whose Tweet? Authorship analysis of micro-blogs and other short-form messages

Abstract

Approaches to authorship attribution have traditionally been constrained by the size of the message to which they can be successfully applied, making them unsuitable for analysing shorter messages such as SMS text messages, micro-blogs (e.g. Twitter) or instant messaging. A large pool of potential authors for a set of texts (as in, for example, an online context) has also proved problematic for traditional descriptive methods, which have tended to succeed only where there is a small and closed set of possible authors. This paper reports the findings of a project which aimed to develop and automate techniques from forensic linguistics that have been successfully applied to the analysis of short message content in criminal cases. Using data drawn from UK-focused online groups within Twitter, the research extends the applicability of Grant’s (2007; 2010) stylistic and statistical techniques for the analysis of authorship of short texts into the online environment. Initial identification of distinctive textual features commonly found within short messages allows for the development of a taxonomy, which can then be used to calculate the ‘distance’ between messages containing instances of these feature types. The end result is an automated process with a high level of success in assigning tweets to the correct author. The research has the potential to extend the scope of reliable and valid authorship analysis into hitherto unexplored contexts. Given the relative anonymity of the internet and the availability of cloaking technology, linguistic research of this nature represents a crucial contribution to the investigative toolkit.
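
The idea of a feature taxonomy feeding a 'distance' between messages can be illustrated with a minimal sketch. The abstract does not specify the features or the distance measure, so everything below is an assumption made for illustration: the feature names and regular expressions in FEATURES are hypothetical, Jaccard distance over binary feature presence is one plausible choice of measure, and the nearest-author rule in attribute() is a simple stand-in rather than the project's automated process.

```python
import re

# Hypothetical feature taxonomy: names and patterns are illustrative only,
# not the taxonomy developed in the project.
FEATURES = {
    "u_for_you": re.compile(r"\bu\b", re.IGNORECASE),
    "2_for_to": re.compile(r"\b2\b"),
    "emoticon": re.compile(r"[:;]-?[)(DP]"),
    "ellipsis": re.compile(r"\.{3,}"),
    "multi_exclaim": re.compile(r"!{2,}"),
}


def feature_set(message):
    """Return the names of the features that occur at least once in a message."""
    return {name for name, pattern in FEATURES.items() if pattern.search(message)}


def jaccard_distance(a, b):
    """Jaccard distance between two feature sets (assumed measure)."""
    union = a | b
    if not union:
        return 0.0  # two messages with no features are treated as identical
    return 1.0 - len(a & b) / len(union)


def attribute(message, known):
    """Assign a message to the candidate author with the smallest mean distance.

    `known` maps each candidate author to a non-empty list of their messages.
    """
    target = feature_set(message)

    def mean_distance(messages):
        total = sum(jaccard_distance(target, feature_set(m)) for m in messages)
        return total / len(messages)

    return min(known, key=lambda author: mean_distance(known[author]))


# Invented example messages for two hypothetical authors.
known_messages = {
    "A": ["c u l8r :)", "r u going 2 town :)"],
    "B": ["See you later...", "Well... I am not sure!!"],
}
guess = attribute("u coming 2 the pub :)", known_messages)
```

In this toy data the queried message shares its abbreviations and emoticon with author A's messages and shares nothing with author B's, so it lies closer to A. A realistic system would need a far richer taxonomy and many more messages per author.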

Divisions:
College of Business and Social Sciences > Aston Institute for Forensic Linguistics
College of Business and Social Sciences > School of Social Sciences & Humanities
College of Business and Social Sciences > School of Social Sciences & Humanities > Centre for Language Research at Aston (CLaRA)
Additional Information: © Copyright remains solely with individual authors
Event Title: 10th Biennial Conference of the International Association of Forensic Linguists
Event Type: Other
Event Dates: 2011-07-11 - 2011-07-14
ISBN: 9781854494320
Last Modified: 18 Feb 2025 08:24
Date Deposited: 13 Jun 2013 10:57
PURE Output Type: Conference contribution
Published Date: 2012
Authors: Macleod, Nicci (ORCID Profile 0000-0002-6642-5509)
Grant, Tim (ORCID Profile 0000-0002-5155-8413)
