The UCT algorithm has been exceedingly popular for Go, a two-player game, significantly increasing the playing strength of Go programs in a very short time. This paper provides an ...
Monte-Carlo tree search, especially the UCT algorithm and its enhancements, has become extremely popular. Because of the importance of this family of algorithms, a deepe...
The UCT algorithm learns a value function online using sample-based search. The TD(λ) algorithm can learn a value function offline for the on-policy distribution. We consider three...
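
As a rough illustration of what learning a value function online from sampled outcomes can look like, the sketch below shows the UCB1 selection rule that UCT applies at each tree node, together with the per-action statistics it updates after every simulation. The names `ucb1_select` and `update`, the layout of `stats`, and the default exploration constant `c` are illustrative assumptions and are not taken from the abstract above.

```python
import math


def ucb1_select(stats, c=math.sqrt(2)):
    """Pick the action with the highest UCB1 score.

    stats maps each action to a (visit_count, total_reward) pair.
    The exploration constant c is an assumed default, not a tuned value.
    """
    total_visits = sum(n for n, _ in stats.values())
    best_action, best_score = None, -math.inf
    for action, (n, total_reward) in stats.items():
        if n == 0:
            # Try every action once before comparing scores.
            return action
        mean_value = total_reward / n
        bonus = c * math.sqrt(math.log(total_visits) / n)
        if mean_value + bonus > best_score:
            best_action, best_score = action, mean_value + bonus
    return best_action


def update(stats, action, reward):
    """Record one sampled outcome for an action (the online value estimate)."""
    n, total_reward = stats[action]
    stats[action] = (n + 1, total_reward + reward)
```

With equal visit counts the rule prefers the action with the higher mean reward, while a rarely visited action receives a large exploration bonus and is tried again even if its current mean is low.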