Reinforcement learning models generally assume that a stimulus is presented that allows a learner to unambiguously identify the state of nature, and that the reward received is drawn from a distribution that depends on that state. However, in any natural environment the stimulus is noisy. When there is state uncertainty it is no longer immediately obvious how to perform reinforcement learning, since the observed reward cannot be unambiguously allocated to a state of the environment. This article addresses the problem of incorporating state uncertainty in reinforcement learning models. We show that simply ignoring the uncertainty and allocating the reward to the most likely state of the environment results in incorrect value estimates. Furthermore, using only the information that is available before observing the reward also results in incorrect estimates. We therefore introduce a new technique, posterior weighted reinforcement learning, in which the estimates of state probabilities are updated after the reward is observed, and the reward is allocated across states according to these posterior probabilities.
Tobias Larsen, David S. Leslie, Edmund J. Collins,
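
To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' exact algorithm: it assumes a Gaussian reward model, a simple prediction-error update, and illustrative parameter names (alpha for the learning rate, sigma for the reward noise scale), none of which are specified in the text above. The point it illustrates is that the reward update is spread across candidate states in proportion to their posterior probabilities, computed after the reward is seen, rather than being assigned entirely to the most likely state or weighted only by the pre-reward (prior) state probabilities.

```python
import numpy as np

def posterior_weighted_update(V, prior, reward, alpha=0.1, sigma=1.0):
    """One posterior-weighted value update under state uncertainty (illustrative sketch).

    V      : current reward estimate for each candidate state
    prior  : belief over states inferred from the noisy stimulus, before the reward
    reward : the scalar reward actually observed
    alpha  : learning rate (illustrative choice)
    sigma  : assumed standard deviation of the reward noise (illustrative choice)
    """
    # Likelihood of the observed reward under each state's current estimate
    # (assumed Gaussian here; the reward model is a modelling assumption).
    likelihood = np.exp(-0.5 * ((reward - V) / sigma) ** 2)

    # Posterior over states: combine the stimulus-based prior with the
    # reward likelihood and normalise.
    posterior = prior * likelihood
    posterior = posterior / posterior.sum()

    # Allocate the prediction-error update across states in proportion to
    # the posterior, instead of crediting only the most likely state.
    V = V + alpha * posterior * (reward - V)
    return V, posterior

# Example: two candidate states, noisy stimulus favouring state 0.
V = np.array([0.0, 0.0])
prior = np.array([0.7, 0.3])
V, posterior = posterior_weighted_update(V, prior, reward=1.0)
```

Replacing the posterior weights with a one-hot vector on the most likely state, or with the prior itself, reproduces the two simpler schemes that the abstract argues lead to incorrect value estimates.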