Abstract— Imitation learning in robots, also called programming by demonstration, has made important advances in recent years, allowing humans to teach context-dependent motor skills and tasks to robots. We propose to extend the contexts usually investigated to include acoustic linguistic expressions that might denote a given motor skill, and thus we target joint learning of motor skills and their potential acoustic linguistic names. In addition, we modify a class of existing algorithms within the imitation learning framework so that they can handle unlabeled demonstrations of several tasks or motor primitives without the imitator being told which task is being demonstrated or how many tasks there are. This is a necessity for language learning, i.e., if one wants to teach, in a natural way, an open number of new motor skills together with their acoustic names. Finally, a mechanism for detecting whether or not linguistic input is relevant to the task is al...