Partially observable decentralized decision making in robot teams is fundamentally different from decision making in fully observable problems. Team members cannot simply apply si...
Rosemary Emery-Montemerlo, Geoffrey J. Gordon, Jef...
For a discrete-time finite-state Markov chain, we develop an adaptive importance sampling scheme to estimate the expected total cost before hitting a set of terminal states. This s...
Robust tracking of abrupt motion is a challenging task
in computer vision due to the large motion uncertainty. In
this paper, we propose a stochastic approximation Monte
Carlo (...
We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...
There is a large literature on the rate of convergence problem for general unconstrained stochastic approximations. Typically, one centers the iterate n about the limit point then...