Abstract. The brain representations of words and their referent actions and objects appear to be strongly coupled neuronal assemblies distributed over several cortical areas. In this work we describe the implementation of a cell assembly-based model of several visual, language, planning, and motor areas to enable a robot to understand and react to simple spoken commands. The essential idea is that different cortical areas represent different aspects of the same entity, and that the long-range cortico-cortical projections represent hetero-associative memories that translate between these aspects or representations.
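To make the translation idea concrete, the following is a minimal sketch (not from the paper itself) of one standard kind of hetero-associative memory, a Willshaw-style binary matrix memory. It assumes sparse binary patterns for the two area-specific representations; the pattern sizes and sparseness are illustrative choices, not parameters from the model described here.

```python
import numpy as np

class HeteroAssociativeMemory:
    """Willshaw-style binary hetero-associative memory.

    Stores pairs of sparse binary patterns (x, y) by clipped Hebbian
    learning; recall maps an x-pattern to its associated y-pattern,
    loosely analogous to a cortico-cortical projection translating
    between two cortical representations of the same entity.
    """

    def __init__(self, n_in: int, n_out: int):
        # Binary synaptic matrix from the input area to the output area.
        self.W = np.zeros((n_out, n_in), dtype=np.uint8)

    def store(self, x: np.ndarray, y: np.ndarray) -> None:
        # Clipped Hebbian update: a synapse is switched on (and stays on)
        # if its pre- and post-synaptic units were ever co-active.
        self.W |= np.outer(y, x).astype(np.uint8)

    def recall(self, x: np.ndarray) -> np.ndarray:
        # Threshold at the number of active input units, so an output
        # unit fires only if it receives input from every active unit.
        return (self.W.astype(np.int32) @ x >= x.sum()).astype(np.uint8)


# Illustrative usage: associate a sparse "word" pattern with a sparse
# "action" pattern and recover the latter from the former.
rng = np.random.default_rng(0)
x = (rng.random(100) < 0.05).astype(np.uint8)
y = (rng.random(100) < 0.05).astype(np.uint8)
mem = HeteroAssociativeMemory(n_in=100, n_out=100)
mem.store(x, y)
assert np.array_equal(mem.recall(x), y)
```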