The high dimensionality of the belief space in DEC-POMDPs is one of the major causes that make optimal joint policy computation intractable. The belief state for a given agent is a p...
We investigate the problem of minimizing the overall transmission delay of data packets in a single-user wireless communication system, where the transmitter has a fixed amount...
The resource constrained shortest path problem (CSP) asks for the computation of a least cost path obeying a set of resource constraints. The problem is NP-complete. We give theore...
We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given (indicating roughly how often we expect a good policy to v...
J. Andrew Bagnell, Sham Kakade, Andrew Y. Ng, Jeff...
In this paper, we present a constraint-partitioning approach for finding locally optimal solutions of large-scale mixed-integer nonlinear programming problems (MINLPs). Ba...