This paper describes a vision-based pedestrian detection and tracking system that can count people in very crowded situations, such as escalator entrances in underground stations. The proposed system uses motion to compute regions of interest and to predict movements, extracts shape information from the video frames to detect individuals, and applies texture features to recognize people. A search strategy creates trajectories and new pedestrian hypotheses, then filters and combines them into accurate counting events. We show that counting accuracies of up to 98% can be achieved.