For mobile robots to assist people in everyday life, they must be easy to instruct. This paper describes a gesture-based interface for human-robot interaction that enables people to instruct robots through easy-to-perform arm gestures. Such gestures may be static pose gestures, which involve only a specific configuration of the person's arm, or dynamic motion gestures, such as waving. Gestures are recognized in real time at approximately frame rate, using a hybrid approach that integrates neural networks and template matching. A fast, color-based tracking algorithm enables the robot to track and follow a person reliably through office environments with drastically changing lighting conditions. Results are reported in the context of an interactive clean-up task, in which a person guides the robot to specific locations that need to be cleaned, and the robot picks up trash and delivers it to the nearest trash bin.