Automatic shaping and decomposition of reward functions

This paper investigates the problem of automatically learning how to restructure the reward function of a Markov decision process so as to speed up reinforcement learning. We begin by describing a method that learns a shaped reward function given a set of state and temporal abstractions. Next, we consider decomposition of the per-timestep reward in multieffector problems, in which the overall agent can be decomposed into multiple units that are concurrently carrying out various tasks. We show by example that to find a good reward decomposition, it is often necessary to first shape the rewards appropriately. We then give a function approximation algorithm for solving both problems together. Standard reinforcement learning algorithms can be augmented with our methods, and we show experimentally that in each case learning becomes significantly faster.
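
To make the idea of a shaped reward concrete, here is a minimal sketch of potential-based shaping, a standard technique from the reinforcement learning literature. It is not necessarily the method the paper learns; the abstract does not give that level of detail. The 5-state chain, the potential function `phi`, the discount `gamma = 0.9`, and all function names are assumptions made purely for illustration.

```python
from typing import Callable

State = int
GOAL = 4  # hypothetical chain with states 0..4; state 4 is the goal


def base_reward(s: State, s_next: State) -> float:
    """Sparse original reward: 1.0 only when the goal is reached."""
    return 1.0 if s_next == GOAL else 0.0


def phi(s: State) -> float:
    """Assumed potential: negative distance to the goal."""
    return -float(GOAL - s)


def shaped_reward(r: float, s: State, s_next: State,
                  potential: Callable[[State], float],
                  gamma: float = 0.9) -> float:
    """Potential-based shaping: r + gamma * potential(s') - potential(s)."""
    return r + gamma * potential(s_next) - potential(s)


# From state 1, compare a step toward the goal with a step away from it.
# The original reward is 0.0 in both cases, so it gives no guidance here.
step_toward = shaped_reward(base_reward(1, 2), 1, 2, phi)
step_away = shaped_reward(base_reward(1, 0), 1, 0, phi)
print(step_toward, step_away)
```

In this toy setting the shaped reward distinguishes the two moves even though the original reward is zero for both, which is the kind of denser feedback that can speed up learning.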

Bhaskara Marthi

Function Approximation Algorithm | ICML 2007 | Machine Learning | Reinforcement Learning Algorithms | Shaped Reward Function |

Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2007
Where	ICML
Authors	Bhaskara Marthi
