Confidence measures for the generalization error are crucial when small training samples are used to construct classifiers. A common approach is to estimate the generalization err...
We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...
This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari...
This paper formalizes Feature Selection as a Reinforcement Learning problem, leading to a provably optimal though intractable selection policy. As a second contribution, this pape...
Monte Carlo simulation techniques that use function approximations have been successfully applied to approximately price multi-dimensional American options. However, for many pric...