Maximum Entropy Inverse Reinforcement Learning


Recent research has shown the benefit of framing problems of imitation learning as solutions to Markov Decision Problems. This approach reduces learning to the problem of recovering a utility function that makes the behavior induced by a near-optimal policy closely mimic demonstrated behavior. In this work, we develop a probabilistic approach based on the principle of maximum entropy. Our approach provides a well-defined, globally normalized distribution over decision sequences, while providing the same performance guarantees as existing methods. We develop our technique in the context of modeling real-world navigation and driving behaviors, where collected data is inherently noisy and imperfect. Our probabilistic approach enables modeling of route preferences as well as a powerful new approach to inferring destinations and routes based on partial trajectories.
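To make the idea of a "globally normalized distribution over decision sequences" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a deterministic setting in which every candidate route can be enumerated, each route is summarized by a feature vector, and the probability of a route is proportional to the exponential of a linear utility. The route features, demonstration indices, function names, learning rate, and step count are all hypothetical values chosen for illustration.

```python
import math

def path_distribution(theta, path_features):
    """P(path) proportional to exp(theta . f_path), normalized over ALL listed paths."""
    scores = [sum(t * f for t, f in zip(theta, feats)) for feats in path_features]
    top = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(s - top) for s in scores]
    z = sum(weights)   # global normalizer (partition function)
    return [w / z for w in weights]

def log_likelihood_gradient(theta, path_features, demo_indices):
    """Gradient = empirical feature averages minus expected features under the model."""
    probs = path_distribution(theta, path_features)
    grad = []
    for k in range(len(theta)):
        empirical = sum(path_features[i][k] for i in demo_indices) / len(demo_indices)
        expected = sum(p * feats[k] for p, feats in zip(probs, path_features))
        grad.append(empirical - expected)
    return grad

def fit(path_features, demo_indices, steps=2000, lr=0.1):
    """Plain gradient ascent on the (concave) demonstration log-likelihood."""
    theta = [0.0] * len(path_features[0])
    for _ in range(steps):
        grad = log_likelihood_gradient(theta, path_features, demo_indices)
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta

# Hypothetical toy data: three routes described by (distance_km, number_of_turns).
PATH_FEATURES = [[2.0, 1.0], [3.0, 0.0], [4.0, 3.0]]
# Hypothetical demonstrations: route 0 driven three times, route 1 twice, route 2 once.
DEMOS = [0, 0, 0, 1, 1, 2]

theta = fit(PATH_FEATURES, DEMOS)
probs = path_distribution(theta, PATH_FEATURES)
for route, p in enumerate(probs):
    print(f"route {route}: probability {p:.3f}")
```

Because the model is fit by matching feature expectations, imperfect demonstrations (here, the occasional longer route) lower the probability of a route rather than ruling it out. Enumerating every route is only feasible in a toy example; a realistic road network would need a dynamic-programming computation of the expected feature counts instead.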

Brian Ziebart, Andrew L. Maas, J. Andrew Bagnell, Anind K. Dey

Real-time Traffic

AAAI 2008 | Intelligent Agents | Mimic Demonstrated Behavior | Probabilistic Approach | Probabilistic Approach Enables


» Modeling Interaction via the Principle of Maximum Causal Entropy

» Planning-based prediction for pedestrians

» Computational Rationalization: The Inverse Equilibrium Problem

» Multitask feature and kernel selection for SVMs

Post Info

Added	02 Oct 2010
Updated	02 Oct 2010
Type	Conference
Year	2008
Where	AAAI
Authors	Brian Ziebart, Andrew L. Maas, J. Andrew Bagnell, Anind K. Dey
