We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we wa...
We introduce the generalized semi-Markov decision process (GSMDP) as an extension of continuous-time MDPs and semi-Markov decision processes (SMDPs) for modeling stochastic decisi...