A simple learning rule is derived, the VAPS algorithm, which can be instantiated to generate a wide range of new reinforcementlearning algorithms. These algorithms solve a number ...
An issue that is critical for the application of Markov decision processes MDPs to realistic problems is how the complexity of planning scales with the size of the MDP. In stochas...
Abstract. We consider an upper confidence bound algorithm for Markov decision processes (MDPs) with deterministic transitions. For this algorithm we derive upper bounds on the onl...
We propose a novel algorithm called GA-MDP for solving the frequency assigment problem. GA-MDP inherits the spirit of genetic algorithms with an adaptation of Markov Decision Proc...
Many stochastic planning problems can be represented using Markov Decision Processes (MDPs). A difficulty with using these MDP representations is that the common algorithms for so...