We present an approach that uses Q-learning on individual robotic agents to coordinate a mission-tasked team of robots in a complex scenario. To reduce the size of the state space, actions are grouped into sets of related behaviors called roles and represented as behavioral assemblages. A role is a finite state automaton, such as Forager, in which the behaviors and their sequencing for finding objects, collecting them, and returning them are already encoded and do not have to be relearned. Each robot starts with the same set of possible roles to play, the same perceptual hardware for coordination, and no contact with other members of the team beyond perception. Over the course of training, a team of Q-learning robots converges to solutions that surpass the performance of a well-designed, handcrafted homogeneous team.
Eric Martinson, Ronald C. Arkin
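The core idea above, learning over roles rather than primitive behaviors, can be illustrated with a minimal sketch. This is not the authors' implementation; the role names, state labels, and learning parameters are hypothetical, and the role set stands in for pre-encoded behavioral assemblages. The sketch shows how treating each role as a single action keeps the Q-table small:

```python
import random
from collections import defaultdict

# Hypothetical role set; each name stands for a pre-encoded
# behavioral assemblage (a finite state automaton of behaviors).
ROLES = ["Forager", "Soldier", "Mechanic"]

class RoleQLearner:
    """Tabular Q-learner whose action space is the set of roles,
    not primitive behaviors -- the state-space reduction described
    in the abstract."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)  # maps (state, role) -> estimated value
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor
        self.epsilon = epsilon       # exploration rate

    def select_role(self, state):
        # Epsilon-greedy selection over roles.
        if random.random() < self.epsilon:
            return random.choice(ROLES)
        return max(ROLES, key=lambda r: self.q[(state, r)])

    def update(self, state, role, reward, next_state):
        # Standard one-step Q-learning backup over the role space.
        best_next = max(self.q[(next_state, r)] for r in ROLES)
        td_error = reward + self.gamma * best_next - self.q[(state, role)]
        self.q[(state, role)] += self.alpha * td_error
```

Because each role already encodes the sequencing of its component behaviors, the learner only has to discover which role to play in each perceived situation, not how to perform it.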