In this paper, we propose a system identification approach to group activity recognition in traffic surveillance. Statistical shape theory is used to extract features, and an autoregressive moving average (ARMA) model is then adopted for feature learning and activity identification. Only a few points, rather than the complete trajectory of each object, are used to describe the dynamics of a group activity, and the ARMA model learns the resulting activity sequences. Experiments on 570 video sequences show an average recognition rate of 88%, compared with 81% for a hidden Markov model (HMM). The experiments also show that the extracted features are invariant to zoom, pan, and tilt.
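To illustrate the kind of invariance that statistical shape theory provides, the following is a minimal sketch of a Kendall-style pre-shape and the full Procrustes distance for a small set of 2-D points. It is an assumption made for illustration and is not the feature extraction used in the paper: the function names `preshape` and `shape_distance` are hypothetical, and the sketch only covers translation, uniform scaling, and in-plane rotation, which is one simple way image-plane effects of camera motion might be approximated.

```python
import numpy as np


def preshape(points):
    """Map a (k, 2) array of landmark points to a pre-shape.

    Centering removes translation; dividing by the Frobenius norm
    removes uniform scale. (Hypothetical helper for illustration.)
    """
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    norm = np.linalg.norm(centered)
    if norm == 0.0:
        raise ValueError("all points coincide; shape is undefined")
    return centered / norm


def shape_distance(a, b):
    """Full Procrustes distance between two point configurations.

    Each pre-shape is written as a complex vector; the modulus of the
    complex inner product is unchanged by in-plane rotation, so the
    distance ignores translation, scale, and rotation.
    """
    za, zb = preshape(a), preshape(b)
    ca = za[:, 0] + 1j * za[:, 1]
    cb = zb[:, 0] + 1j * zb[:, 1]
    inner = abs(np.vdot(ca, cb))
    return float(np.sqrt(max(0.0, 1.0 - inner ** 2)))


def similarity_transform(points, scale, angle, shift):
    """Apply uniform scale, rotation (radians), and translation."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s], [s, c]])
    return scale * np.asarray(points, dtype=float) @ rotation.T + np.asarray(shift)


if __name__ == "__main__":
    # Four points standing in for the positions of objects in one frame.
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    moved = similarity_transform(square, scale=2.5, angle=0.7, shift=(10, -3))
    rectangle = [(0, 0), (3, 0), (3, 1), (0, 1)]

    print("same shape, different view:", shape_distance(square, moved))
    print("different shape:", shape_distance(square, rectangle))
```

In this sketch the distance between a configuration and its scaled, rotated, and shifted copy is numerically zero, while a genuinely different configuration yields a clearly positive distance. A sequence of such per-frame descriptors is the sort of input a time-series model such as ARMA could then be trained on.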