The integration of vision and natural language processing is attracting increasing attention in different areas of AI research. Up to now, however, there have been only a few attempts at connecting vision systems with natural language access systems. Within SFB 314, the special collaborative program on AI and knowledge-based systems, the automatic natural language description of real-world image sequences constitutes a major research goal, which has been pursued during the last ten years. The aim of our approach is to obtain an incremental evaluation and simultaneous description of the perceived time-varying scenes. In this contribution we report on new results of our joint efforts at combining the natural language access system VITRA with a vision system. We have investigated the problem of describing the movements of articulated bodies in image sequences within an integrated natural language and computer vision system. The paper will focus on our model-based approach for the recognition ...