Abstract. Crowd behavior recognition is becoming an important research topic in video surveillance for public places. In this paper, we first discuss crowd feature selection and extraction and propose a multiple-frame feature point detection and tracking method based on the KLT tracker. We argue that behavior modelling of crowds is usually coarse compared to that of individuals. Instead of developing general crowd behavior models, we propose to model crowd events for specific end-user scenarios. As a result, the same type of event may be modelled slightly differently from one scenario to another, and several models have to be defined. Fast modelling is therefore required, and this is enabled by the use of an extended Scenario Recognition Engine (SRE) in our approach. Crowd event models are defined; in particular, composite events that accommodate evidence accumulation make it possible to increase detection reliability. Tests have been conducted on real surveillance video sequences containing crowd s...