There has been a lot of recent work on Bayesian methods for reinforcement learning exhibiting near-optimal online performance. The main obstacle facing such methods is that in most...
The Web is based on a browsing paradigm that makes it di cult to retrieve and integrate data from multiple sites. Today, the only way to do this is to build specialized applicatio...
In this paper, we describe a method for pedagogical agents to choose when to interact with learners in interactive learning environments. This method is based on observations of h...
Planning in large, partially observable domains is challenging, especially when a long-horizon lookahead is necessary to obtain a good policy. Traditional POMDP planners that plan...
In this paper, we investigate the use of parallelization in reinforcement learning (RL), with the goal of learning optimal policies for single-agent RL problems more quickly by us...