Predicting goal probabilities with improved xG models using event sequences in association football

Abstract

In association football, predicting the likelihood and outcome of a shot at goal is useful but challenging. Expected goals (xG) models can be used in a variety of ways, including evaluating performance and designing offensive strategies. This study proposes a novel framework that uses the events preceding a shot to improve the accuracy of the xG metric. The framework combines previously explored temporal features with unexplored ones. The new features include “advancement factor” and “player position column”. A random forest model was used, and it performed better than the single-event-based models published in the literature. Results further demonstrated a significant improvement in model performance with the inclusion of preceding event information. The proposed framework and model enable the discovery of event sequences that improve xG, including opportunities built up from the sides of the 18-yard box, shots attempted from in front of the goal within the opposition’s 18-yard box, and shots from successful passes to the far post.

Publication DOI: https://doi.org/10.1371/journal.pone.0312278
Divisions: College of Engineering & Physical Sciences > Aston Digital Futures Institute
Funding Information: This work is part of the first author’s PhD, which is funded by Deakin University, Melbourne, under a Deakin-Coventry cotutelle scholarship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Additional Information: Copyright © 2024 Bandara et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Uncontrolled Keywords: Athletic Performance/physiology, Goals, Humans, Male, Probability, Soccer
Publication ISSN: 1932-6203
Data Access Statement: The dataset is publicly available and was collected by a third-party company. Link to the dataset: https://github.com/statsbomb/open-data
Last Modified: 19 Dec 2024 08:23
Date Deposited: 12 Nov 2024 17:37
Related URLs: https://journal ... al.pone.0312278 (Publisher URL)
http://www.scop ... tnerID=8YFLogxK (Scopus URL)
PURE Output Type: Article
Published Date: 2024-10-30
Published Online Date: 2024-10-30
Accepted Date: 2024-10-03
Submitted Date: 2024-04-23
Authors: Bandara, Ishara
Shelyag, Sergiy
Rajasegarar, Sutharshan
Dwyer, Dan
Kim, Eun-jin
Angelova, Maia (ORCID Profile 0000-0002-0931-0916)

Version: Published Version

License: Creative Commons Attribution
