Sciweavers


ECML
2006
Springer


Bandit Based Monte-Carlo Planning


Abstract. For large state-space Markovian Decision Problems, Monte-Carlo planning is one of the few viable approaches to find near-optimal solutions. In this paper we introduce a new algorithm, UCT, that applies bandit ideas to guide Monte-Carlo planning. In finite-horizon or discounted MDPs, the algorithm is shown to be consistent, and finite-sample bounds are derived on the estimation error due to sampling. Experimental results show that in several domains, UCT is significantly more efficient than its alternatives.

Levente Kocsis, Csaba Szepesvári
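
The abstract only names the idea of using bandit methods to guide Monte-Carlo planning, so the following is a minimal illustrative sketch and not the authors' implementation. It assumes the bandit rule is UCB1 with exploration constant c = sqrt(2), that returns are scaled to [0, 1], and that the planner has access to a generative model step(state, action, rng). The names Node, simulate, uct_plan and toy_step, and the two-step toy problem, are hypothetical and introduced only for this example.

```python
import math
import random


def ucb1_select(counts, means, c=math.sqrt(2)):
    """Pick the arm maximising mean + c * sqrt(ln(total) / n) (UCB1, assumed here)."""
    # Every arm is tried once before the bonus formula is used.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    total = sum(counts)
    scores = [m + c * math.sqrt(math.log(total) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


class Node:
    """Per-state statistics: visit count and mean return of each action."""

    def __init__(self, n_actions):
        self.counts = [0] * n_actions
        self.means = [0.0] * n_actions
        self.children = {}  # (action, next_state) -> Node


def rollout(step, state, depth, n_actions, rng):
    """Estimate the return from a new leaf with uniformly random actions."""
    total = 0.0
    for _ in range(depth):
        state, reward, done = step(state, rng.randrange(n_actions), rng)
        total += reward
        if done:
            break
    return total


def simulate(node, state, depth, step, n_actions, rng, c):
    """Run one simulated episode through the tree and update the statistics."""
    if depth == 0:
        return 0.0
    action = ucb1_select(node.counts, node.means, c)
    next_state, reward, done = step(state, action, rng)
    ret = reward
    if not done:
        key = (action, next_state)
        child = node.children.get(key)
        if child is None:
            # First visit: add the node, then finish the episode randomly.
            node.children[key] = Node(n_actions)
            ret += rollout(step, next_state, depth - 1, n_actions, rng)
        else:
            ret += simulate(child, next_state, depth - 1, step, n_actions, rng, c)
    node.counts[action] += 1
    node.means[action] += (ret - node.means[action]) / node.counts[action]
    return ret


def uct_plan(root_state, step, n_actions, horizon, n_sims, c=math.sqrt(2), seed=0):
    """Return the most-visited root action after n_sims simulated episodes."""
    rng = random.Random(seed)
    root = Node(n_actions)
    for _ in range(n_sims):
        simulate(root, root_state, horizon, step, n_actions, rng, c)
    return max(range(n_actions), key=root.counts.__getitem__)


def toy_step(state, action, rng):
    """Hypothetical 2-step problem: action 1 pays 0.5 per step, action 0 pays 0."""
    next_state = state + 1
    return next_state, (0.5 if action == 1 else 0.0), next_state >= 2


best_action = uct_plan(0, toy_step, n_actions=2, horizon=2, n_sims=500)
```

In this toy problem the planner should settle on action 1 at the root, because each node spends most of its simulations on the action with the higher estimated return while still sampling the other action occasionally.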


ECML 2006 | Finite Sample Bounds | Machine Learning | Markovian Decision Problems | Viable Approaches



Post Info

Added	22 Aug 2010
Updated	22 Aug 2010
Type	Conference
Year	2006
Where	ECML
Authors	Levente Kocsis, Csaba Szepesvári
