Target-Based Offensive Language Identification

Abstract

We present TBO, a new dataset for Target-based Offensive language identification. TBO contains post-level annotations regarding the harmfulness of an offensive post and token-level annotations comprising the target and the offensive argument expression. Popular offensive language identification datasets for social media focus on annotation taxonomies only at the post level; more recently, some datasets have been released that feature only token-level annotations. TBO is an important resource that bridges the gap between post-level and token-level annotation datasets by introducing a single, comprehensive, unified annotation taxonomy. We use the TBO taxonomy to annotate post-level and token-level offensive language in English Twitter posts. We release an initial dataset of over 4,500 instances collected from Twitter, and we carry out multiple experiments to compare the performance of different models trained and tested on TBO.

Publication DOI: https://doi.org/10.18653/v1/2023.acl-short.66
Divisions: College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies > Applied AI & Robotics
College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies
Additional Information: Materials published in or after 2016 are licensed on a Creative Commons Attribution 4.0 International License.
Event Title: 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Event Type: Other
Event Dates: 2023-07-09 - 2023-07-14
Uncontrolled Keywords: Computer Science Applications, Linguistics and Language, Language and Linguistics
ISBN: 9781959429715
Last Modified: 18 Nov 2024 08:56
Date Deposited: 24 Oct 2023 10:43
Related URLs: http://www.scop ... tnerID=8YFLogxK (Scopus URL)
https://aclanth ... 3.acl-short.66/ (Publisher URL)
PURE Output Type: Conference contribution
Published Date: 2023-07-09
Accepted Date: 2023-07-01
Authors: Zampieri, Marcos
Morgan, Skye
North, Kai
Ranasinghe, Tharindu (ORCID Profile 0000-0003-3207-3821)
Simmons, Austin
Khandelwal, Paridhi
Rosenthal, Sara
Nakov, Preslav

Version: Published Version

License: Creative Commons Attribution
