Designing distributed controllers for self-reconfiguring modular robots has been consistently challenging. We have developed a reinforcement learning approach that can be used both to automate controller design and to adapt robot behavior online. In this paper, we report on our study of reinforcement learning in the domain of self-reconfigurable modular robots: the underlying assumptions, the applicable algorithms, and the issues of partial observability, large search spaces and local optima. We propose, and validate experimentally in simulation, a number of techniques designed to address these and other scalability issues that arise in applying machine learning to distributed systems such as modular robots. We discuss ways to make learning faster, more robust and amenable to online application by giving scaffolding to the learning agents in the form of policy representation, structured experience and additional information. With enough structure, modular robots can run learning algorith...