Partially Observable Markov Decision Processes (POMDPs) provide an appropriately rich model for agents operating under partial knowledge of the environment. Since finding an opti...
Yan Virin, Guy Shani, Solomon Eyal Shimony, Ronen ...
This paper investigates the problem of automatically learning how to restructure the reward function of a Markov decision process so as to speed up reinforcement learning. We begi...
One-clock priced timed games are a class of two-player, zero-sum, continuous-time games that was defined and thoroughly studied in previous works. We show that One-clock priced ti...
Thomas Dueholm Hansen, Rasmus Ibsen-Jensen, Peter ...
Planning methods for deterministic planning problems traditionally exploit factored representations to encode the dynamics of problems in terms of a set of parameters, e.g., the l...
Although many real-world stochastic planning problems are more naturally formulated by hybrid models with both discrete and continuous variables, current state-of-the-art methods ...
Carlos Guestrin, Milos Hauskrecht, Branislav Kveto...