Latouche, P. (2007). Distribute Machine Learning. Masters thesis, Aston University.
Abstract
In the last ten years, there has been an ever-increasing use of databases to store information, and of Machine Learning methods to manipulate, extract, and analyse data. More and more problems are being tackled in science, health and engineering. As a consequence, there has been a concurrent increase in the use of highly distributed computing to store and manipulate data. In this thesis, we work on regression problems that consist of approximating underlying processes that map input variables to target variables. We introduce the concept of a distributed learning environment, where local agents use distributed data to train, and we show that two critical applications can be tackled using such architectures. First, in Chapter 3, we consider a situation where data is originally physically distributed on nodes. The agents do not agree to share their data for privacy and security reasons but do agree to share their models. In this environment, the issue is to combine the learned information in order to build a more accurate predictive model. For our experiments, we consider multilayer perceptrons and radial basis function networks. We test some model combination methods using a toy dataset and some scatterometry data. Then, in Chapter 4, we tackle Gaussian processes, which are known to scale poorly to large data sets since they require matrix inversions whose computational cost and memory requirement are of order O(N^3) and O(N^2) respectively, where N is the number of training data points. We investigate techniques that consist of splitting and then distributing the data on nodes. We show that the Bayesian committee machine can be applied to estimate Gaussian process predictions, whereas a factorized hyperposterior can lead to optimization procedures over the whole training data set even if N is large. We experiment with these approximations using the scatterometry data.
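As an illustration of the split-and-distribute idea described for Gaussian processes, the following is a minimal sketch of a Bayesian committee machine in its simplified pointwise form, where each query point is combined separately. It is not the thesis's implementation: the squared-exponential kernel, the fixed hyperparameters, the 1-D inputs, and the function names `rbf_kernel`, `gp_predict` and `bcm_predict` are all assumptions made for the example. Each module fits an exact Gaussian process on its own subset, so the cubic cost applies to the subset size rather than to N.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise_var=0.1):
    """Exact GP posterior mean and variance of the latent function at x_test."""
    k_nn = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_ns = rbf_kernel(x_train, x_test)
    chol = np.linalg.cholesky(k_nn)  # O(n^3) in the size of this subset
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_ns)
    mean = k_ns.T @ alpha
    var = rbf_kernel(x_test, x_test).diagonal() - np.sum(v**2, axis=0)
    return mean, var

def bcm_predict(x_train, y_train, x_test, n_modules, noise_var=0.1):
    """Pointwise BCM: combine the predictions of n_modules independent GPs."""
    prior_var = rbf_kernel(x_test, x_test).diagonal()
    x_parts = np.array_split(x_train, n_modules)
    y_parts = np.array_split(y_train, n_modules)
    # Combined precision: sum of module precisions minus (M - 1) prior precisions,
    # which corrects for the prior being counted once per module.
    precision = -(n_modules - 1) / prior_var
    weighted_mean = np.zeros(len(x_test))
    for xs, ys in zip(x_parts, y_parts):
        mean_i, var_i = gp_predict(xs, ys, x_test, noise_var)
        precision = precision + 1.0 / var_i
        weighted_mean += mean_i / var_i
    var = 1.0 / precision
    return var * weighted_mean, var

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(-3.0, 3.0, 200)            # synthetic inputs, not the thesis data
    y = np.sin(x) + 0.1 * rng.standard_normal(200)
    x_query = np.linspace(-2.0, 2.0, 5)
    mean, var = bcm_predict(x, y, x_query, n_modules=4)
    print(mean, var)
```

With a single module the combination reduces to the exact Gaussian process prediction, which gives a simple sanity check. The sketch splits the data in the order given, so the inputs should be shuffled beforehand if they arrive sorted.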
Publication DOI: | https://doi.org/10.48780/publications.aston.ac.uk.00021478 |
---|---|
Divisions: | College of Engineering & Physical Sciences |
Additional Information: | Copyright © Latouche, P. 2007. P. Latouche asserts their moral right to be identified as the author of this thesis. This copy of the thesis has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with its author and that no quotation from the thesis and no information derived from it may be published without appropriate permission or acknowledgement. If you have discovered material in Aston Publications Explorer which is unlawful e.g. breaches copyright, (either yours or that of a third party) or any other law, including but not limited to those relating to patent, trademark, confidentiality, data protection, obscenity, defamation, libel, then please read our Takedown Policy and contact the service immediately. |
Institution: | Aston University |
Uncontrolled Keywords: | machine learning,information engineering |
Last Modified: | 15 May 2025 10:10 |
Date Deposited: | 19 Mar 2014 11:50 |
Completed Date: | 2007 |
Authors: | Latouche, P. |