Abstract. We consider a control problem where the decision maker interacts with a standard Markov decision process with the exception that the reward functions vary arbitrarily ove...
Direct approaches, which involve asking patients various abstract questions, have significant drawbacks. We propose a new approach that infers patient preferences based on observ...
Zeynep Erkin, Matthew D. Bailey, Lisa M. Maillart,...
We investigate the use of temporally abstract actions, or macro-actions, in the solution of Markov decision processes. Unlike current models that combine both primitive actions and macro-...
Milos Hauskrecht, Nicolas Meuleau, Leslie Pack Kae...
Abstract. We consider an upper confidence bound algorithm for Markov decision processes (MDPs) with deterministic transitions. For this algorithm we derive upper bounds on the onl...
Abstract. In parametric Markov Decision Processes (PMDPs), transition probabilities are not fixed, but are given as functions over a set of parameters. A PMDP denotes a family of ...