Abstract. In this paper we study distributed online learning of locomotion gaits for modular robots. The learning is based on a stochastic approximation method, SPSA, which optimizes the parameters of coupled oscillators used to generate periodic actuation patterns. The strategy is implemented in a distributed fashion, based on a globally shared reward signal, but otherwise utilizing local communication only. In a physicsbased simulation of modular Roombots robots we experiment with online learning of gaits and study the effects of: module failures, different robot morphologies, and rough terrains. The experiments demonstrate fast online learning, typically 5-30 min. for convergence to high performing gaits (≈ 30 cm/sec), despite high numbers of open parameters (45-54). We conclude that the proposed approach is efficient, effective and a promising candidate for online learning on many other robotic platforms.