CIG
2006
IEEE

Monte-Carlo Go Reinforcement Learning Experiments

Abstract: This paper describes experiments that use reinforcement learning to compute the pattern urgencies used during simulations in a Monte-Carlo Go architecture. Monte-Carlo is currently a popular technique for computer Go. In a previous study, Monte-Carlo was combined with domain-dependent knowledge in the Go-playing program Indigo. In 2003, a 3x3 pattern database was built manually. This paper explores the possibility of using reinforcement learning to tune the 3x3 pattern urgencies automatically. On 9x9 boards, within the Monte-Carlo architecture of Indigo, the automatically learned urgencies outperform the manually built ones by a 3-point margin on average, which is satisfactory. Although the current results on 19x19 boards are promising, obtaining strictly positive results on such a large board remains to be done.
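To make the idea of "pattern urgencies" concrete, the following is a minimal sketch of how urgencies might bias move selection inside a simulation and how they could be adjusted from simulation outcomes. It is not the method used in Indigo or in the paper: the function names `pick_move` and `update_urgencies`, the default urgency of 1.0, the multiplicative update rule, and the `step` and `baseline` parameters are all illustrative assumptions.

```python
import random


def pick_move(candidates, urgency, rng=random):
    """Sample one move with probability proportional to its pattern urgency.

    candidates: list of (move, pattern_id) pairs for the legal moves.
    urgency:    dict mapping pattern_id -> non-negative weight
                (unknown patterns get a hypothetical default of 1.0).
    """
    moves = [move for move, _ in candidates]
    weights = [urgency.get(pattern_id, 1.0) for _, pattern_id in candidates]
    return rng.choices(moves, weights=weights, k=1)[0]


def update_urgencies(urgency, played_patterns, outcome, baseline, step=0.05):
    """Assumed update rule (not taken from the paper): scale the urgency of
    each pattern played in a simulation up when the simulation scored above
    the baseline and down when it scored below it."""
    for pattern_id in played_patterns:
        scaled = urgency.get(pattern_id, 1.0) * (1.0 + step * (outcome - baseline))
        # Keep weights strictly positive so sampling remains well defined.
        urgency[pattern_id] = max(scaled, 1e-3)
    return urgency
```

In this sketch, a simulation would call `pick_move` at every step, record the patterns of the moves it played, and pass them to `update_urgencies` together with the final score once the simulated game ends.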
Bruno Bouzy, Guillaume Chaslot
Added 10 Jun 2010
Updated 10 Jun 2010
Type Conference
Year 2006
Where CIG