A Bayesian formulation of search, control and the exploration/exploitation trade-off

Rohwer, Richard and Zhu, Huaiyu (1995). A Bayesian formulation of search, control and the exploration/exploitation trade-off. Technical Report. Aston University, Birmingham, UK.

Abstract

A new approach to optimisation is introduced based on a precise probabilistic statement of what is ideally required of an optimisation method. It is convenient to express the formalism in terms of the control of a stationary environment. This leads to an objective function for the controller which unifies the objectives of exploration and exploitation, thereby providing a quantitative principle for managing this trade-off. This is demonstrated using a variant of the multi-armed bandit problem. This approach opens new possibilities for optimisation algorithms, particularly by using neural network or other adaptive methods for the adaptive controller. It also opens possibilities for deepening understanding of existing methods. The realisation of these possibilities requires research into practical approximations of the exact formalism.

Divisions:	Aston University (General)
Uncontrolled Keywords:	optimisation, objective function, quantitative principle, approximations
ISBN:	NCRG/95/017
Last Modified:	12 Jan 2026 08:04
Date Deposited:	24 Sep 2009 13:52
PURE Output Type:	Technical report
Published Date:	1995-08-15
Authors:	Rohwer, Richard; Zhu, Huaiyu
