Coherent Inference on Optimal Play in Game Trees

Round-based games are an instance of discrete planning problems. Some of the best contemporary game tree search algorithms use random roll-outs as data. Relying on a good policy, they learn on-policy values by propagating information upwards in the tree, but not between sibling nodes. Here, we present a generative model and a corresponding approximate message passing scheme for inference on the optimal, off-policy value of nodes in smooth AND/OR trees, given random roll-outs. The crucial insight is that the distribution of values in game trees is not completely arbitrary. We define a generative model of the on-policy values using a latent score for each state, representing the value under the random roll-out policy. Inference on the values under the optimal policy separates into an inductive, pre-data step and a deductive, post-data part. Both can be solved approximately with Expectation Propagation, allowing off-policy value inference for any node in the (exponentially big) tree in l...
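
The distinction the abstract draws between on-policy values (under the random roll-out policy) and optimal, off-policy values can be illustrated on a toy tree. The sketch below is not the paper's Expectation Propagation scheme; it only contrasts a roll-out estimate with an exact minimax value. The tree `TREE`, the payoff convention (1 for a win of the maximizing player, 0 for a loss), and all function names are hypothetical choices made for illustration.

```python
import random

# Hypothetical toy tree: an internal node is a list of children, a leaf is
# the payoff for the maximizing player (1 = win, 0 = loss).
TREE = [[1, 0], [1, 1], [0, 0]]


def rollout(node, rng):
    """Play uniformly random moves down to a leaf and return its payoff."""
    while isinstance(node, list):
        node = rng.choice(node)
    return node


def on_policy_estimate(node, n_rollouts=2000, seed=0):
    """Monte Carlo estimate of the value under the random roll-out policy."""
    rng = random.Random(seed)
    return sum(rollout(node, rng) for _ in range(n_rollouts)) / n_rollouts


def on_policy_exact(node):
    """Exact value under the uniform random policy (mean over children)."""
    if not isinstance(node, list):
        return float(node)
    return sum(on_policy_exact(child) for child in node) / len(node)


def optimal_value(node, maximizing=True):
    """Exact minimax value: levels alternate between max (OR) and min (AND)."""
    if not isinstance(node, list):
        return node
    values = [optimal_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


if __name__ == "__main__":
    print("roll-out estimate:", on_policy_estimate(TREE))
    print("exact on-policy  :", on_policy_exact(TREE))
    print("optimal (minimax):", optimal_value(TREE))
```

On this tree the random policy wins half the time from the root, while optimal play wins with certainty, so roll-out averages alone understate the root's optimal value. Exhaustive minimax, as used here, is only feasible because the tree is tiny.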

Philipp Hennig, David H. Stern, Thore Graepel

Game Trees | Generative Model | JMLR 2010 | Random Roll-out

Post Info

Added	19 May 2011
Updated	19 May 2011
Type	Journal
Year	2010
Where	JMLR
Authors	Philipp Hennig, David H. Stern, Thore Graepel
