R-max is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-max, the agent always maintains a complet...
Partially observable Markov decision processes (POMDPs) allow one to model complex dynamic decision or control problems that include both action outcome uncertainty and imperfect ...
A Markov Decision Process (MDP) is a general model for solving planning problems under uncertainty. It has been extended to multiobjective MDP to address multicriteria or multiagen...
Abstract. With more and more electronic information sources becoming widely available, the issue of the quality of these often-competing sources has become germane. We propose a st...
In this paper we present the B-coder, an efficient binary arithmetic coder that performs extremely well on a wide range of data. The B-coder should be classed as an `approximate...