Recognizing realistic actions in unconstrained videos remains difficult because of large variations in camera motion, background clutter, object appearance, and other factors. In this paper, we first propose a Single-Feature Hierarchical Latent Dirichlet Allocation model (SF-HLDA), which extends Latent Dirichlet Allocation to a hierarchical form, for realistic action recognition. We then extend SF-HLDA to a Multi-Feature Hierarchical Latent Dirichlet Allocation model (MF-HLDA), which can effectively fuse several different features into a single model for recognizing realistic actions. Experiments demonstrate the effectiveness of the proposed models.