Robert Cohn, Michael Maxim, Edmund H. Durfee, Satinder Singh

When its human operator cannot continuously supervise (much less teleoperate) an agent, the agent should be able to recognize its limitations and ask for help when it risks making autonomous decisions that could significantly surprise and disappoint the operator. Inspired by prior research on exploration-exploitation tradeoffs and on inverse reinforcement learning, we develop Expected Myopic Gain (EMG), a Bayesian approach in which an agent explicitly models its uncertainty and how possible operator responses to queries could improve its decisions. With EMG, an agent can weigh the relative expected utilities of seeking operator help versus acting autonomously. We provide conditions under which EMG is optimal, and preliminary empirical results on simple domains showing that EMG can perform well even when its optimality conditions are violated.

Keywords: Human-robot/agent Interaction, Planning, Value of Information
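As a rough illustration of the decision rule the abstract describes (written in our own notation, which may differ from the paper's; the symbols b, q, a, pi, and U are assumptions for exposition), the expected myopic gain of a query q under the agent's current belief b can be sketched as

EMG(q) = \mathbb{E}_{a \sim P(\cdot \mid q, b)} \left[ \max_{\pi} U(\pi \mid b_{q,a}) \right] - \max_{\pi} U(\pi \mid b)

where b_{q,a} denotes the Bayesian posterior belief after observing answer a to query q, and U(\pi \mid b) is the expected utility of executing policy \pi under belief b. Under this reading, the agent asks the query maximizing EMG(q) when that gain exceeds the cost of interrupting the operator, and acts autonomously otherwise.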