We describe and evaluate a system for learning domain-specific control knowledge. In particular, given a planning domain, the goal is to output a control policy that performs well on "long random walk" problem distributions. The system is based on viewing planning domains as very large Markov decision processes and then applying a recent variant of approximate policy iteration that is bootstrapped with a new technique based on random walks. We evaluate the system on the AIPS-2000 planning domains (among others) and show that the learned policies often perform well on problems drawn from the long random walk distributions.
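To make the approach concrete, below is a minimal Python sketch of one plausible reading of the described loop: training problems are generated by random walks (the end state of the walk becomes the goal), and each iteration of approximate policy iteration improves the current policy via rollout-based lookahead, then refits a policy to the improved decisions. Everything here is illustrative: the `domain` interface (`sample_initial_state`, `applicable_actions`, `apply`), the `fit_policy` learner, and the rollout parameters are all hypothetical stand-ins, not the paper's actual implementation.

```python
import random

def sample_random_walk_problem(domain, walk_length):
    """Sample a training problem: random-walk `walk_length` steps from an
    initial state and treat the resulting state as the goal."""
    start = domain.sample_initial_state()
    goal = start
    for _ in range(walk_length):
        goal = domain.apply(random.choice(domain.applicable_actions(goal)), goal)
    return start, goal

def rollout_cost(domain, policy, state, goal, horizon=50):
    """Estimate cost-to-goal by following `policy` for up to `horizon` steps."""
    for steps in range(horizon):
        if state == goal:
            return steps
        state = domain.apply(policy(state, goal), state)
    return horizon  # goal not reached within the horizon

def fit_policy(examples, domain):
    """Stand-in learner that memorizes decisions; the actual system would
    instead induce a generalizing policy from such examples."""
    table = {(s, g): a for s, g, a in examples}
    return lambda s, g: table.get((s, g), random.choice(domain.applicable_actions(s)))

def approximate_policy_iteration(domain, iterations=5, problems=100, walk_length=20):
    """Outer loop: sample random-walk problems, improve the current policy
    by one step of rollout-based lookahead, and refit to the improved choices."""
    # Bootstrap with a random policy; short random-walk goals keep the
    # early training problems easy enough for it to make progress on.
    policy = lambda s, g: random.choice(domain.applicable_actions(s))
    for _ in range(iterations):
        examples = []
        for _ in range(problems):
            state, goal = sample_random_walk_problem(domain, walk_length)
            # One-step policy improvement: keep the applicable action whose
            # successor has the best rollout estimate under the current policy.
            best = min(domain.applicable_actions(state),
                       key=lambda a: rollout_cost(domain, policy,
                                                  domain.apply(a, state), goal))
            examples.append((state, goal, best))
        policy = fit_policy(examples, domain)  # hypothetical supervised learner
    return policy
```

A natural refinement, consistent with the random-walk bootstrapping idea, would be to grow `walk_length` across iterations so problem difficulty rises as the policy improves; the sketch keeps it fixed for brevity.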