We present a generalized, extensible framework for automated recognition of swarming activities in video sequences. The trajectory of each individual is produced by the visual tracking sub-system and is further analyzed to detect certain types of high-level grouping behavior. We use recent findings in swarming behavior analysis to formulate the problem in terms of a specific distance function, which we then apply within a two-stage agglomerative clustering method: the first stage produces a set of swarming events, and the second integrates those events into a set of swarming activities. In this paper we present results for one particular type of swarming: shopper grouping. Events detected in a relatively short time interval are further integrated into activities, which are the manifestation of prolonged high-level swarming behavior. The results demonstrate the ability of our method to detect such activities in congested surveillance videos. In particular, in three hours of indoor retail store video, ou...