Abstract. This paper proposes a local motion-based approach to recognizing group activities in soccer videos. Given SIFT keypoint matches between two successive frames, we propose a simple but effective method for dividing the keypoints into a background point set and a foreground point set. The former is used to estimate camera motion, and the latter is used to represent group actions. After camera motion compensation, we apply a local motion descriptor that characterizes the relative motion between corresponding keypoints in the two frames. The proposed descriptor is effective at representing group activities because it focuses on the local motion of individuals and excludes noise such as residual background motion caused by inaccurate compensation. Experimental results show that our approach achieves high recognition rates on soccer videos and is robust to inaccurate compensation.
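
The pipeline summarized above (split matched keypoints into background and foreground, estimate camera motion from the background, then keep the compensated motion of the foreground) can be illustrated with a minimal sketch. This is not the paper's method: it assumes the camera motion is a pure 2D translation and estimates it as the per-axis median displacement, a stand-in for whatever grouping and motion model the paper actually uses. The function name `split_and_compensate` and the threshold `tol` are hypothetical, and the input is assumed to be a list of already-matched keypoint coordinates.

```python
from statistics import median


def split_and_compensate(matches, tol=2.0):
    """Split matched keypoints into background/foreground (illustrative only).

    matches: list of ((x0, y0), (x1, y1)) pairs, one per keypoint matched
             between two consecutive frames.
    tol:     hypothetical threshold (pixels) on residual motion below which
             a keypoint is treated as background.
    """
    # Raw displacement of each matched keypoint between the two frames.
    disp = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in matches]

    # Assumption: camera motion is a pure translation, estimated robustly
    # as the per-axis median of all displacements.
    cam = (median(d[0] for d in disp), median(d[1] for d in disp))

    background, foreground = [], []
    for (p0, _), (dx, dy) in zip(matches, disp):
        # Residual motion after subtracting the estimated camera motion.
        rx, ry = dx - cam[0], dy - cam[1]
        if (rx * rx + ry * ry) ** 0.5 <= tol:
            background.append(p0)
        else:
            # Keep the compensated motion vector for foreground keypoints.
            foreground.append((p0, (rx, ry)))
    return cam, background, foreground
```

The median is used here only because it tolerates a minority of foreground keypoints without any iterative fitting. A more realistic camera model (for example, a homography fitted to the background set) would replace the translation estimate, but the split-then-compensate structure would stay the same.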