Building Relational World Models for Reinforcement Learning

Abstract. Many reinforcement learning domains are highly relational. While traditional temporal-difference methods can be applied to these domains, they are limited in their capacity to exploit the relational nature of the domain. Our algorithm, AMBIL, constructs relational world models in the form of relational Markov decision processes (MDPs). AMBIL works backwards from collections of high-reward states, utilizing inductive logic programming to learn their preimage, logical definitions of the region of state space that leads to the high-reward states via some action. These learned preimages are chained together to form an MDP that abstractly represents the domain. AMBIL estimates the reward and transition probabilities of this MDP from past experience. Since our MDPs are small, AMBIL uses value iteration to quickly estimate the Q-values of each action in the induced states and determine a policy. AMBIL is able to employ complex background knowledge and supports relational representations. Emp...
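The last two steps named in the abstract (estimating an abstract MDP from past experience, then running value iteration to get Q-values and a policy) can be illustrated with a minimal sketch. This is not AMBIL's implementation: the abstract state names ("far", "near", "goal"), the action names, the sample data, and the discount factor are all hypothetical, and the inductive logic programming step that would learn the abstract states as preimages is not shown.

```python
from collections import defaultdict


def estimate_model(transitions):
    """Estimate P(s' | s, a) and mean reward R(s, a) from (s, a, r, s') samples."""
    counts = defaultdict(lambda: defaultdict(int))
    reward_sum = defaultdict(float)
    for s, a, r, s_next in transitions:
        counts[(s, a)][s_next] += 1
        reward_sum[(s, a)] += r
    P, R = {}, {}
    for sa, nexts in counts.items():
        n = sum(nexts.values())
        P[sa] = {s2: c / n for s2, c in nexts.items()}
        R[sa] = reward_sum[sa] / n
    return P, R


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10000):
    """Return Q-values for every observed (state, action) pair."""
    Q = {sa: 0.0 for sa in P}
    actions_in = defaultdict(list)
    for s, a in P:
        actions_in[s].append(a)

    def state_value(s):
        # States with no observed actions (e.g. terminal ones) have value 0.
        return max((Q[(s, a)] for a in actions_in[s]), default=0.0)

    for _ in range(max_iters):
        delta = 0.0
        for sa in P:
            new_q = R[sa] + gamma * sum(
                p * state_value(s2) for s2, p in P[sa].items()
            )
            delta = max(delta, abs(new_q - Q[sa]))
            Q[sa] = new_q
        if delta < tol:
            break
    return Q


def greedy_policy(Q):
    """Pick the highest-valued observed action in each state."""
    best = {}
    for (s, a), q in Q.items():
        if s not in best or q > Q[(s, best[s])]:
            best[s] = a
    return best


# Hypothetical experience over three abstract states.
experience = (
    [("far", "advance", 0.0, "near")] * 3
    + [("far", "advance", 0.0, "far")]
    + [("near", "advance", 1.0, "goal")]
    + [("far", "wait", 0.0, "far"), ("near", "wait", 0.0, "near")]
)

P, R = estimate_model(experience)
Q = value_iteration(P, R)
policy = greedy_policy(Q)
print(policy)
```

Because the abstract MDP has only a handful of states, a tabular sweep like this converges in very few iterations, which is the property the abstract relies on when it calls value iteration quick.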

Trevor Walker, Lisa Torrey, Jude W. Shavlik, Richard Maclin

Artificial Intelligence | ILP 2007 | Relational Markov Decision | Relational Nature | Relational World Models |

Added	08 Jun 2010
Updated	08 Jun 2010
Type	Conference
Year	2007
Where	ILP
Authors	Trevor Walker, Lisa Torrey, Jude W. Shavlik, Richard Maclin
