A Monte-Carlo AIXI Approximation

Download www.hutter1.net

This paper describes a computationally feasible approximation to the AIXI agent, a universal reinforcement learning agent for arbitrary environments. AIXI is scaled down in two key ways: First, the class of environment models is restricted to all prediction suffix trees of a fixed maximum depth. This allows a Bayesian mixture of environment models to be computed in time proportional to the logarithm of the size of the model class. Secondly, the finite-horizon expectimax search is approximated by an asymptotically convergent Monte Carlo Tree Search technique. This scaled-down AIXI agent is empirically shown to be effective on a wide class of toy problem domains, ranging from simple fully observable games to small POMDPs. We explore the limits of this approximate agent and propose a general heuristic framework for scaling this technique to much larger problems.
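As a rough illustration of the first idea, the sketch below mixes over all binary suffix trees up to a fixed depth using Context Tree Weighting with KT estimators. It is a simplified stand-in, not the paper's implementation: it models a plain bit sequence with no actions or rewards, pads the initial context with zeros, and the names `Node`, `CTW`, `update`, and `predict` are chosen here for illustration. Each update touches only the `depth + 1` nodes on the current context path.

```python
import copy
import math


class Node:
    """One context in the tree: KT counts plus cached log-probabilities."""

    def __init__(self):
        self.counts = [0, 0]   # number of 0s and 1s seen in this context
        self.log_pe = 0.0      # log KT-estimator probability of the data seen here
        self.log_pw = 0.0      # log weighted (mixture) probability of that data
        self.children = {}     # context bit -> child Node


class CTW:
    """Bayesian mixture over all binary suffix trees of depth <= `depth`."""

    def __init__(self, depth):
        self.depth = depth
        self.root = Node()
        self.history = [0] * depth   # assumption: zero-padded initial context

    def update(self, bit):
        """Absorb one bit and return log P(bit | history) under the mixture."""
        before = self.root.log_pw
        # Walk down the path selected by the most recent bits (newest first).
        path = [self.root]
        for k in range(self.depth):
            ctx_bit = self.history[-1 - k]
            path.append(path[-1].children.setdefault(ctx_bit, Node()))
        # Update leaf-to-root so each child is current before its parent mixes it.
        for d in range(self.depth, -1, -1):
            node = path[d]
            a, b = node.counts
            node.log_pe += math.log((node.counts[bit] + 0.5) / (a + b + 1))
            node.counts[bit] += 1
            if d == self.depth:
                node.log_pw = node.log_pe
            else:
                # Unvisited children have seen no data, so they contribute log 1 = 0.
                log_split = sum(c.log_pw for c in node.children.values())
                hi = max(node.log_pe, log_split)
                lo = min(node.log_pe, log_split)
                # log(0.5 * exp(log_pe) + 0.5 * exp(log_split)), computed stably
                node.log_pw = math.log(0.5) + hi + math.log1p(math.exp(lo - hi))
        self.history.append(bit)
        return self.root.log_pw - before

    def predict(self, bit):
        """Probability that the next bit equals `bit`; leaves the model unchanged."""
        return math.exp(copy.deepcopy(self).update(bit))
```

For example, after feeding an alternating sequence with `m = CTW(depth=3)` and `m.update(b)` for each bit, `m.predict(0)` and `m.predict(1)` sum to one and favour the bit that continues the pattern. Copying the whole tree in `predict` keeps the sketch short; a practical version would update along the path and then revert.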

Joel Veness, Kee Siong Ng, Marcus Hutter, William T. B. Uther, David Silver

AIXI Agent | Environment Models | JAIR 2011 | Fixed Maximum Depth

Post Info

Added	14 May 2011
Updated	14 May 2011
Type	Journal
Year	2011
Where	JAIR
Authors	Joel Veness, Kee Siong Ng, Marcus Hutter, William T. B. Uther, David Silver

Sciweavers
