In this paper, we study the following basic problem: After having executed a sequence of actions, find a sequence of actions that brings the agent back to the state just before th...
It has long been recognized that users can have complex preferences on plans. Non-intrusive learning of such preferences by observing the plans executed by the user is an attracti...
Nan Li, William Cushing, Subbarao Kambhampati, Sun...
Partially observable Markov decision processes (POMDPs) provide a principled, general framework for robot motion planning in uncertain and dynamic environments. They have been app...
Sylvie C. W. Ong, Shao Wei Png, David Hsu, Wee Sun...
Autonomous agents that learn about their environment can be divided into two broad classes. One class of existing learners, reinforcement learners, typically employ weak learning ...
Abstract— Real-world robotic environments are highly structured. The scalability of planning and reasoning methods to cope with complex problems in such environments crucially de...