In this paper, we propose a multimodal system for detecting human activity and interaction patterns in a nursing home. Activities of groups of people are firstly treated as interaction patterns between any pair of partners and are then further broken into individual activities and behavior events using a multi-level context hierarchy graph. The graph is implemented using a dynamic Bayesian network to statistically model the multi-level concepts. We have developed a coarse-to-fine prototype system to illustrate the proposed concept. Experimental results have demonstrated the feasibility of the proposed approaches. The objective of this research is to automatically create concise and comprehensive reports of activities and behaviors of patients to support physicians and caregivers in a nursing facility. Categories and Subject Descriptors I.4.8 [Scene analysis]: motion, color, shape, tracking, stereo General Terms Algorithms Keywords Multimodal, human interaction, group activity, medical...