A k-NN Based Perception Scheme for Reinforcement Learning

Abstract: Reinforcement Learning (RL) is a paradigm of modern Machine Learning (ML) which uses rewards and punishments to guide the learning process. One of the central ideas of RL is learning by “direct-online” interaction with the environment. This is the key difference from supervised machine learning, in which the learner is told what actions to take. In RL, by contrast, the agent (learner) acts autonomously and receives only a scalar reward signal that is used to evaluate how good the current behavioral policy is. The framework of RL is designed to guide the learner in maximizing the average reward in the long run. One consequence of this learning paradigm is that the agent must explore new behavioral policies, because there is no supervisor to tell it what actions to take; the trade-off between exploration and exploitation is thus a key characteristic of RL. Typically, exploration procedures select actions following a random distribution in order to gain more knowledge of the env...
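The exploration/exploitation trade-off described in the abstract can be illustrated with a minimal sketch. This is not the paper's k-NN perception scheme; it is a generic epsilon-greedy rule on a toy multi-armed bandit, assuming Gaussian rewards with unit variance. The names `epsilon_greedy`, `run_bandit`, and `true_means` are hypothetical and chosen only for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a uniformly random action (explore);
    otherwise pick the action with the highest estimated value (exploit)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Toy bandit (an assumption, not from the paper): each action returns a
    scalar reward drawn from a Gaussian centred on its hidden mean."""
    rng = random.Random(seed)
    q = [0.0] * len(true_means)     # running estimate of each action's value
    counts = [0] * len(true_means)  # how often each action was taken
    total = 0.0
    for _ in range(steps):
        a = epsilon_greedy(q, epsilon, rng)
        r = rng.gauss(true_means[a], 1.0)  # scalar reward signal
        counts[a] += 1
        q[a] += (r - q[a]) / counts[a]     # incremental sample mean
        total += r
    return q, counts, total / steps


if __name__ == "__main__":
    q, counts, avg_reward = run_bandit([0.0, 1.0, 0.2])
    print("value estimates:", q)
    print("action counts:", counts)
    print("average reward:", avg_reward)
```

Setting `epsilon` to 0 makes the agent purely greedy, so it may never discover a better action; setting it to 1 makes every choice random, so what has been learned is never used. Intermediate values trade one against the other.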

José Antonio Martin H., Javier de Lope Asiaín

Environment Reward Function | EUROCAST 2007 | Machine Learning | Scalar Reward Signal |

Post Info

Added	07 Jun 2010
Updated	07 Jun 2010
Type	Conference
Year	2007
Where	EUROCAST
Authors	José Antonio Martin H., Javier de Lope Asiaín
