Viktor Zhumatiy, Faustino J. Gomez, Marcus Hutter,

We address the problem of autonomously learning controllers for vision-capable mobile robots. We extend McCallum's (1995) Nearest-Sequence Memory algorithm to allow for general metrics over state-action trajectories. We demonstrate the feasibility of our approach by successfully running the algorithm on a real mobile robot. The algorithm is novel in that it (a) explores the environment and learns directly on a mobile robot, without using a hand-built computer model as an intermediate step; (b) does not require manual discretization of the sensor input space; (c) works in piecewise continuous perceptual spaces; and (d) copes with partial observability. Together, these properties allow learning from far less experience than previous methods require.

Keywords: reinforcement learning; mobile robots

Also affiliated with: TU Munich, Boltzmannstr. 3, 85748 Garching, M