Mobile robots often rely upon systems that render sensor data and perceptual features into costs that a planner can use. The behavior that a designer wishes the planner to execute is often clear, while specifying costs that engender this behavior is a much more difficult task. This is particularly apparent when attempting to simultaneously tune the many parameters that define the mapping from features to resulting plans. We present a novel structured maximum margin approach to learning this mapping from example trajectories demonstrated by a human. The learning problem is transformed into a convex optimization problem, and we provide a simple, efficient algorithm that leverages fast planning methods. Finally, we demonstrate the algorithm's performance on learning to map features to plans for two different types of input features.
Nathan D. Ratliff, J. Andrew Bagnell, Martin Zinkevich
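To give a concrete flavor of the learning step described in the abstract, the following is a minimal, hypothetical sketch of a subgradient-style maximum margin update on a toy grid, with a simple Dijkstra search standing in for the "fast planning method." It is not the authors' implementation; the grid size, features, loss map, learning rate, and demonstrated path are all illustrative assumptions.

```python
# Hedged sketch: learn cost weights so a planner reproduces a demonstrated path.
# All quantities (grid, features, demo path, loss map) are toy assumptions.
import heapq
import numpy as np


def plan_shortest_path(costs, start, goal):
    """Dijkstra over a 4-connected grid of per-cell costs; returns a list of cells."""
    rows, cols = costs.shape
    dist = np.full((rows, cols), np.inf)
    prev = {}
    dist[start] = costs[start]
    heap = [(costs[start], start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            break
        if d > dist[r, c]:
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + costs[nr, nc]
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, (nr, nc)))
    path, node = [], goal
    while node != start:
        path.append(node)
        node = prev[node]
    path.append(start)
    return path[::-1]


def path_feature_counts(path, features):
    """Sum the per-cell feature vectors visited by a path."""
    return sum(features[r, c] for r, c in path)


def margin_update(w, features, demo_path, start, goal, loss_map, lr=0.1):
    """One margin-based update: plan under loss-augmented costs, then adjust the
    weights so the demonstrated path becomes cheaper than the planned one."""
    # Costs are linear in the features; subtracting a per-cell loss makes paths
    # that stray from the demonstration look artificially cheap to the planner.
    costs = np.maximum(features @ w - loss_map, 1e-3)
    planned = plan_shortest_path(costs, start, goal)
    grad = path_feature_counts(demo_path, features) - \
        path_feature_counts(planned, features)
    return w - lr * grad


# Toy usage: 2 features per cell on a 5x5 grid, a straight-line demonstration,
# and a loss map that penalizes cells off the demonstrated path.
rng = np.random.default_rng(0)
features = rng.random((5, 5, 2))
demo_path = [(0, j) for j in range(5)]
loss_map = np.ones((5, 5))
for r, c in demo_path:
    loss_map[r, c] = 0.0
w = np.ones(2)
for _ in range(50):
    w = margin_update(w, features, demo_path, (0, 0), (0, 4), loss_map)
```

Each update calls the planner once on loss-augmented costs, which is what makes it practical to reuse fast, off-the-shelf planning methods inside the learning loop.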