One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...
Partially observable Markov decision process (POMDP) is commonly used to model a stochastic environment with unobservable states for supporting optimal decision making. Computing ...
Abstract This paper studies the interaction of error and information both in a single-person setting and in an interactive setting. In contrast to Blackwell’s Theorem, which says...
With standard assumptions the routing and wavelength assignment problem (RWA) can be viewed as a Markov Decision Process (MDP). The problem, however, defies an exact solution bec...
We consider the problem of binary classification where the classifier may abstain instead of classifying each observation. The Bayes decision rule for this setup, known as Chow...
Yves Grandvalet, Alain Rakotomamonjy, Joseph Keshe...