For a discrete-time finite-state Markov chain, we develop an adaptive importance sampling scheme to estimate the expected total cost before hitting a set of terminal states. This s...
Monte Carlo simulation techniques that use function approximations have been successfully applied to approximately price multi-dimensional American options. However, for many pric...
We present a new method for estimating the expected return of a POMDP from experience. The estimator does not assume any knowledge of the POMDP, can estimate the returns for finit...
Typically, agent evaluation is done through Monte Carlo estimation. However, stochastic agent decisions and stochastic outcomes can make this approach inefficient, requiring many s...
Michael H. Bowling, Michael Johanson, Neil Burch, ...
We propose an analysis of numerical integration based on sampling theory, whereby the integration error caused by aliasing is suppressed by pre-filtering. We derive a pre-filter f...