Control of exploitation-exploration meta-parameter in reinforcement learning



In reinforcement learning (RL), the duality between exploitation and exploration has long been an important issue. This paper presents a new method that controls the balance between exploitation and exploration. Our learning scheme is based on model-based RL, in which Bayesian inference with a forgetting effect estimates the state-transition probability of the environment. The balance parameter, which corresponds to the randomness in action selection, is controlled based on the variation of action results and the perception of environmental change. When applied to maze tasks, our method successfully obtains good controls by adapting to environmental changes. Recently, Usher et al. [Science 283 (1999) 549] have suggested that noradrenergic neurons in the locus coeruleus may control the exploitation
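The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the update equations, so the sketch assumes a common form, namely Dirichlet-style transition counts that decay by a hypothetical `forgetting` factor, and softmax action selection whose inverse temperature `beta` plays the role of the balance parameter. All class, function, and parameter names are illustrative, and the paper's rule for adapting the balance parameter is not reproduced here.

```python
import math
import random


class ForgettingTransitionModel:
    """Transition estimate from decayed counts (assumed form, not the paper's)."""

    def __init__(self, n_states, n_actions, forgetting=0.95, prior=1.0):
        self.n_states = n_states
        self.forgetting = forgetting  # hypothetical decay factor in (0, 1]
        self.prior = prior            # symmetric pseudo-count per next state
        self.counts = [
            [[0.0] * n_states for _ in range(n_actions)]
            for _ in range(n_states)
        ]

    def update(self, s, a, s_next):
        """Decay old evidence for (s, a), then add the new observation."""
        row = self.counts[s][a]
        for j in range(self.n_states):
            row[j] *= self.forgetting
        row[s_next] += 1.0

    def probs(self, s, a):
        """Posterior-mean estimate of P(s_next | s, a)."""
        row = self.counts[s][a]
        total = sum(row) + self.prior * self.n_states
        return [(c + self.prior) / total for c in row]


def softmax_policy(q_values, beta):
    """Action probabilities; beta = 0 is uniform, large beta is near-greedy."""
    m = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (q - m)) for q in q_values]
    z = sum(weights)
    return [w / z for w in weights]


def sample_action(q_values, beta, rng=random):
    """Draw one action index from the softmax distribution."""
    probs = softmax_policy(q_values, beta)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]
```

Because old counts shrink at every update, a model of this kind tracks a changed environment after a few new observations, while a small `beta` keeps action selection exploratory and a large one makes it exploitative.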

Shin Ishii, Wako Yoshida, Junichiro Yoshimoto


Environmental Changes | Model-based RL | Neural Networks | NN 2002 | Usher et al.


Post Info

Added	22 Dec 2010
Updated	22 Dec 2010
Type	Journal
Year	2002
Where	NN
Authors	Shin Ishii, Wako Yoshida, Junichiro Yoshimoto
