We state the problem of inverse reinforcement learning in terms of preference elicitation, resulting in a principled (Bayesian) statistical formulation. This generalises previous w...
Many industrial processes involve making parts with an assembly of machines, where each machine carries out an operation on a part, and the finished product requires a whole series of ...
Factored representations, model-based learning, and hierarchies are well-studied techniques for improving the learning efficiency of reinforcement-learning algorithms in large-sca...
Carlos Diuk, Alexander L. Strehl, Michael L. Littm...
Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...
We propose a new framework for aiding a reinforcement learner by allowing it to relocate, or move, to a state it selects so as to decrease the number of steps it needs to take in ...