This paper proposes an efficient computational technique for the optimal control of linear discrete-time systems subject to bounded disturbances with mixed polytopic constraints o...
We investigate methods for planning in a Markov Decision Process where the cost function is chosen by an adversary after we fix our policy. As a running example, we consider a rob...
H. Brendan McMahan, Geoffrey J. Gordon, Avrim Blum
In this paper, we derive a time-complexity bound for the gradient projection method for optimal routing in data networks. This result shows that the gradient projection algorith...
Agents often have to construct plans that obey deadlines or, more generally, resource limits for real-valued resources whose consumption can only be characterized by probability d...
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bili...