Methods based on local features have the advantage of representing important local motion features as a bag-of-features, but they lose location information. In addition, a bag-of-features approach requires language models such as pLSA and LDA (Latent Dirichlet Allocation). These models are learned without supervision, but the number of latent topics must be set manually. In this study, to perform LDA without specifying the number of latent topics and to handle multiple words concurrently, we propose unsupervised Multiple Instances Hierarchical Dirichlet Process LDA (MI-HDP-LDA), which employs the local information concurrently. The proposed method, unsupervised MI-HDP-LDA, was evaluated on the Weizmann dataset. The average recognition rate by