Enhancing Linear B-cell Epitope Prediction Through Organism-Specific Training

Abstract

B-cell epitopes play a crucial role in immune responses, and their identification is vital for numerous medical endeavours, including the development of diagnostic tests, therapeutic antibodies, and vaccines. Linear B-cell epitopes (LBCE) are often prioritised over conformational epitopes as targets for epitope predictors because of the availability of data, the lower experimental complexity of determining them, and their stability under various conditions, which facilitates storage and transport. Despite advancements in computational techniques, existing LBCE prediction methods still exhibit suboptimal performance. This thesis explores the efficacy of organism-specific training in improving the accuracy and efficiency of linear B-cell epitope prediction models. Most LBCE prediction tools adopt a generalist approach, training models on large heterogeneous data sets from numerous organisms to develop predictors that are applicable across a wide variety of pathogens. In contrast, this work investigates the training of bespoke, organism-specific LBCE prediction models. The main hypothesis is that training on smaller, but potentially more directly relevant, organism-specific data sets could yield predictors that outperform a single generalist model on new epitopes of the target organism. The main research objectives of this work were: to investigate whether training linear B-cell epitope prediction models on organism-specific data leads to improved prediction performance compared with models trained on heterogeneous or hybrid data, and with well-established epitope predictors from the literature; and to investigate the limits of this organism-specific training approach by systematically quantifying the effect of the amount of training data on the performance of the models developed.
Results indicate that organism-specific training significantly enhances the prediction performance of linear B-cell epitope models, even for organisms with limited training data. Comparative analysis demonstrates the superiority of organism-specific models over heterogeneous, hybrid and other conventional predictors, highlighting the potential of tailored modelling approaches in epitope prediction.

Publication DOI: https://doi.org/10.48780/publications.aston.ac.uk.00046276
Divisions: College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies > Applied AI & Robotics
Additional Information: Copyright © Jodie Sylvia May Ashford, 2023. Jodie Sylvia May Ashford asserts their moral right to be identified as the author of this thesis. This copy of the thesis has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with its author and that no quotation from the thesis and no information derived from it may be published without appropriate permission or acknowledgement.
Institution: Aston University
Uncontrolled Keywords: Epitope Prediction, Machine Learning, Computational Biology
Last Modified: 03 May 2024 13:08
Date Deposited: 03 May 2024 13:08
Completed Date: 2023-09
Authors: Ashford, Jodie S.M.
