Abstract— We propose a new approximate algorithm, LAJIV (Lookahead J-MDP Information Value), to solve Oracular Partially Observable Markov Decision Problems (OPOMDPs), a special ...
tnesses for Abstract Interpretation-based Proofs Fr´ed´eric Besson, Thomas Jensen, and Tiphaine Turpin IRISA/{Inria, CNRS, Universit´e de Rennes 1} Campus de Beaulieu, F-35042 R...
In a M/M/N+M queue, when there are many customers waiting, it may be preferable to reject a new arrival rather than risk that arrival later abandoning without receiving service. O...
In this paper, we propose a new lower approximation scheme for POMDP with discounted and average cost criterion. The approximating functions are determined by their values at a fi...
This paper presents a novel framework called proto-reinforcement learning (PRL), based on a mathematical model of a proto-value function: these are task-independent basis function...