Globally optimal learning rates in multilayer neural networks


A method for calculating the globally optimal learning rate in on-line gradient-descent training of multilayer neural networks is presented. The method is based on a variational approach which maximizes the decrease in generalization error over a given time frame. We demonstrate the method by computing optimal learning rates in typical learning scenarios. The method can also be employed when different learning rates are allowed for different parameter vectors, and to assess the relevance of related training algorithms based on modifications to the basic gradient-descent rule.
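To make the setting concrete, the sketch below illustrates the on-line learning scenario the abstract refers to: a student soft-committee machine trained by on-line gradient descent on examples labelled by a fixed teacher, with a (possibly time-dependent) learning rate. This is a minimal illustration, not the paper's variational method; the network sizes, the tanh activation (the erf activation is common in this literature), and the constant learning-rate schedule are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N, K = 20, 2  # input dimension and number of hidden units (illustrative sizes)

# Teacher: a fixed soft-committee machine whose outputs the student must learn.
B = rng.standard_normal((K, N)) / np.sqrt(N)

def forward(W, x):
    """Soft-committee output: unweighted sum of hidden-unit activations
    (tanh used here as a stand-in for the usual erf activation)."""
    return np.tanh(W @ x).sum()

def gen_error(W, n_test=2000):
    """Monte Carlo estimate of the generalization error
    0.5 * <(student(x) - teacher(x))^2> over Gaussian inputs."""
    X = rng.standard_normal((n_test, N))
    errs = [(forward(W, x) - forward(B, x)) ** 2 for x in X]
    return 0.5 * np.mean(errs)

def train(eta_schedule, steps=5000):
    """On-line gradient descent: one fresh example per step, learning rate
    scaled by 1/N as in the standard on-line learning analysis."""
    W = rng.standard_normal((K, N)) / np.sqrt(N)  # random student start
    for t in range(steps):
        x = rng.standard_normal(N)
        delta = forward(W, x) - forward(B, x)  # student-teacher error signal
        # Gradient of 0.5 * delta^2 w.r.t. the student weights W:
        grad = delta * (1 - np.tanh(W @ x) ** 2)[:, None] * x[None, :]
        W -= (eta_schedule(t) / N) * grad
    return W

# A constant schedule; the paper's point is that an optimal eta(t) can be
# derived variationally rather than fixed by hand like this.
W_final = train(lambda t: 1.0)
```

Swapping `eta_schedule` for a time-dependent function is where a globally optimal schedule, of the kind derived in the paper, would plug in.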

Divisions: College of Engineering & Physical Sciences > Systems analytics research institute (SARI)
Additional Information: Proceedings of the MINERVA workshop on Mesoscopics, Fractals and Neural Networks, 25-27 March 1997, Eilat (IL). This is an electronic version of an article published in Saad, David and Rattray, Magnus (1998). Globally optimal learning rates in multilayer neural networks. Philosophical Magazine Part B, 77 (5), pp. 1523-1530. Philosophical Magazine Part B is available online at:
Uncontrolled Keywords: optimal learning rate, gradient descent, multilayer neural networks, variational approach, generalization error, gradient-descent rule
Publication ISSN: 1364-2812
Last Modified: 02 Jan 2024 08:04
Date Deposited: 21 Sep 2009 16:36
Related URLs: http://www.scop ... tnerID=8YFLogxK (Scopus URL) ... ue=5&spage=1523 (Publisher URL)
PURE Output Type: Article
Published Date: 1998-05
Authors: Saad, David (ORCID Profile 0000-0001-9821-2623)
Rattray, Magnus



Version: Accepted Version


