While exploring to nd better solutions, an agent performing online reinforcement learning (RL) can perform worse than is acceptable. In some cases, exploration might have unsafe, ...
Satinder P. Singh, Andrew G. Barto, Roderic A. Gru...
Although many real-world stochastic planning problems are more naturally formulated by hybrid models with both discrete and continuous variables, current state-of-the-art methods ...
Carlos Guestrin, Milos Hauskrecht, Branislav Kveto...
— We consider a collection of robots sharing a common environment, each robot constrained to move on a roadmap in its configuration space. To program optimal collision-free moti...
We describe a Markov chain method for sampling from the distribution of the hidden state sequence in a non-linear dynamical system, given a sequence of observations. This method u...
: We present a new approach for computing generalized Voronoi diagrams in two and three dimensions using interpolation-based polygon rasterization hardware. The input primitives ma...
Kenneth E. Hoff III, Tim Culver, John Keyser, Ming...