As robots enter the human environment and come into contact with inexperienced users, they need to be able to interact with users in a multi-modal fashion: keyboard and mouse are no longer acceptable as the only input modalities. This paper introduces a novel approach to programming a robot interactively through a multi-modal interface. The key characteristic of this approach is that the user can provide feedback interactively at any time, during both the programming and the execution phase. The framework takes a three-step approach to the problem: multi-modal recognition, intention interpretation, and prioritized task execution. The multi-modal recognition module translates hand gestures and spontaneous speech into a structured symbolic data stream without abstracting away the user's intent. The intention interpretation module selects the appropriate primitives to generate a task based on the user's input, the system's current state, and robot sensor data. Finally, the prioritized task execution module selects and executes primitives based on the robot's current state, sensor inputs, and the task generated in the previous step.
Soshi Iba, Christiaan J. J. Paredis, Pradeep K. Khosla
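
To make the three-step pipeline concrete, the following is a minimal, hypothetical Python sketch (not the authors' implementation) of the data flow described in the abstract: recognition turns raw speech and gesture events into symbolic tokens, interpretation maps those tokens and the robot state to task primitives, and execution runs primitives in priority order so that interactive user feedback can preempt the current plan. All class and function names are illustrative assumptions.

    # Hypothetical sketch of the three-module pipeline: recognition ->
    # intention interpretation -> prioritized task execution.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Symbol:
        """Structured symbolic token produced by the recognition module."""
        modality: str   # "speech" or "gesture"
        value: str      # e.g. "pick" or "stop"

    @dataclass
    class Primitive:
        """Executable robot primitive with a priority used by the executor."""
        name: str
        priority: int   # higher value = more urgent (e.g. user interruptions)

    def recognize(raw_inputs: List[str]) -> List[Symbol]:
        """Translate raw speech/gesture events into symbols without discarding intent."""
        return [Symbol(modality=m, value=v)
                for m, v in (s.split(":", 1) for s in raw_inputs)]

    def interpret(symbols: List[Symbol], robot_state: dict) -> List[Primitive]:
        """Select primitives for a task from user symbols and the current robot state."""
        task = []
        for s in symbols:
            if s.value == "stop":
                task.append(Primitive("halt_motion", priority=10))  # feedback preempts plan
            elif s.value == "pick":
                task.append(Primitive("grasp_object", priority=5))
            else:
                task.append(Primitive("noop", priority=0))
        return task

    def execute(task: List[Primitive]) -> None:
        """Run primitives in priority order so interactive feedback runs first."""
        for p in sorted(task, key=lambda p: p.priority, reverse=True):
            print(f"executing {p.name} (priority {p.priority})")

    if __name__ == "__main__":
        symbols = recognize(["speech:pick", "gesture:stop"])
        execute(interpret(symbols, robot_state={"gripper": "open"}))

In this sketch the "stop" gesture receives the highest priority, so it is executed before the previously programmed pick primitive, mirroring the paper's emphasis on accepting user feedback at any time during programming and execution.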