Abakasanga, Emeka, Kousovista, Rania, Cosma, Georgina, Akbari, Ashley, Zaccardi, Francesco, Kaur, Navjot, Fitt, Danielle, Jun, Gyuchan Thomas, Kiani, Reza and Gangadharan, Satheesh (2025). Equitable hospital length of stay prediction for patients with learning disabilities and multiple long-term conditions using machine learning. Frontiers in Digital Health, 7.
Abstract
PURPOSE: Individuals with learning disabilities (LD) often face higher rates of premature mortality and prolonged hospital stays compared to the general population. Predicting the length of stay (LOS) for patients with LD and multiple long-term conditions (MLTCs) is critical for improving patient care and optimising medical resource allocation. However, there is limited research on the application of machine learning (ML) models to this population. Furthermore, approaches designed for the general population often lack generalisability and fairness, particularly when applied across sensitive groups within their cohort.

METHOD: This study analyses hospitalisations of 9,618 patients with LD in Wales using electronic health records (EHR) from the SAIL Databank. A Random Forest (RF) ML model was developed to predict hospital LOS, incorporating demographics, medication history, lifestyle factors, and 39 long-term conditions. To address fairness concerns, two bias mitigation techniques were applied: a post-processing threshold optimiser and an in-processing reductions method using an exponentiated gradient. These methods aimed to minimise performance discrepancies across ethnic groups while ensuring robust model performance.

RESULTS: The RF model outperformed other state-of-the-art models, achieving an area under the curve of 0.759 for males and 0.756 for females, a false negative rate of 0.224 for males and 0.229 for females, and a balanced accuracy of 0.690 for males and 0.689 for females. Bias mitigation algorithms reduced disparities in prediction performance across ethnic groups, with the threshold optimiser yielding the most notable improvements. Performance metrics, including false positive rate and balanced accuracy, showed significant enhancements in fairness for the male cohort.

CONCLUSION: This study demonstrates the feasibility of applying ML models to predict LOS for patients with LD and MLTCs, while addressing fairness through bias mitigation techniques. The findings highlight the potential for equitable healthcare predictions using EHR data, paving the way for improved clinical decision-making and resource management.
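
The abstract reports false negative rate, false positive rate, and balanced accuracy, and describes reducing disparities in these metrics across ethnic groups. The following is a minimal sketch of how such per-group metrics might be computed and compared, not the authors' implementation. The function names `group_rates` and `max_gap`, the toy labels, the group names "A" and "B", and the use of the max-minus-min gap as a disparity measure are all assumptions for illustration; the abstract does not state how disparity was quantified.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return per-group FNR, FPR and balanced accuracy for binary labels.

    Assumption: label 1 marks the positive class (e.g. a long stay).
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t:
            counts[g]["tp" if p else "fn"] += 1
        else:
            counts[g]["fp" if p else "tn"] += 1

    rates = {}
    for g, c in counts.items():
        pos = c["tp"] + c["fn"]
        neg = c["fp"] + c["tn"]
        fnr = c["fn"] / pos if pos else float("nan")
        fpr = c["fp"] / neg if neg else float("nan")
        # Balanced accuracy = (TPR + TNR) / 2 = 1 - (FNR + FPR) / 2
        rates[g] = {
            "fnr": fnr,
            "fpr": fpr,
            "balanced_accuracy": 1 - (fnr + fpr) / 2,
        }
    return rates


def max_gap(rates, metric):
    """Largest between-group difference for one metric (hypothetical disparity measure)."""
    values = [r[metric] for r in rates.values()]
    return max(values) - min(values)


# Toy data, invented purely for illustration (not from the study).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_rates(y_true, y_pred, groups)
for name, r in sorted(rates.items()):
    print(name, r)
print("FNR gap:", max_gap(rates, "fnr"))
```

In this toy case the two groups have the same balanced accuracy but opposite error profiles (one misses positives, the other raises false alarms), which is one reason a single aggregate metric can hide group-level differences.
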
| Publication DOI: | https://doi.org/10.3389/fdgth.2025.1538793 |
|---|---|
| Divisions: | College of Engineering & Physical Sciences; College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies; Aston University (General) |
| Funding Information: | The authors declare financial support was received for the research, authorship, and/or publication of this article. Data-driven machinE-learning aided stratification and management of multiple long-term COnditions in adults with intellectual disabilitiEs (DECODE) project (NIHR203981) is funded by the NIHR AI for Multiple Long-term Conditions (AIM) Programme. |
| Additional Information: | © 2025 Abakasanga, Kousovista, Cosma, Akbari, Zaccardi, Kaur, Fitt, Jun, Kiani and Gangadharan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
| Uncontrolled Keywords: | bias mitigation, exponentiated gradient, learning disabilities, length of stay, threshold optimiser, Medicine (miscellaneous), Biomedical Engineering, Health Informatics, Computer Science Applications |
| Publication ISSN: | 2673-253X |
| Data Access Statement: | The datasets presented in this article are not readily available as all proposals to use SAIL data are subject to review by the independent IGRP. The anonymised individual-level data sources used in this study are available in the SAIL Databank at Swansea University, Swansea, UK. Before any data can be accessed, approval must be given by the IGRP. |
| Last Modified: | 05 Feb 2026 08:42 |
| Date Deposited: | 03 Feb 2026 15:31 |
| Full Text Link: | |
| Related URLs: | http://www.scop ... tnerID=8YFLogxK (Scopus URL) https://www.fro ... 25.1538793/full (Publisher URL) |
| PURE Output Type: | Article |
| Published Date: | 2025-02-14 |
| Accepted Date: | 2025-01-27 |
| Authors: | Abakasanga, Emeka (0000-0002-4742-3102); Kousovista, Rania; Cosma, Georgina; Akbari, Ashley; Zaccardi, Francesco; Kaur, Navjot; Fitt, Danielle; Jun, Gyuchan Thomas; Kiani, Reza; Gangadharan, Satheesh |