To realize natural human-robot interaction and to investigate the developmental mechanism of human communication, an effective approach is to construct models with which a robot imitates human cognitive functions. Focusing on the fact that humans use motion information from others’ actions, this paper presents a learning model that enables a robot to acquire the ability to establish joint attention with a human by using both static and motion information. As the motion information, the robot uses the optical flow detected while observing a human who shifts his/her gaze from the robot to another object. As the static information, it extracts an edge image of the human's face while he/she is gazing at the object. The static and motion information have complementary characteristics. The former gives the exact direction of gaze, although it is difficult to interpret. The latter, on the other hand, provides a rough but easily understandable relati...