Visual analysis of human behavior has generated considerable interest in the field of computer vision because of its wide spectrum of potential applications. In this paper, we present a language modeling framework for understanding human behavior. The proposed framework consists of two modules: a key posture selection module and a variable-length Markov model (VLMM) behavior recognition module. The key posture selection algorithm is based on the shape context matching technique. The selected key postures form a codebook, which is used to convert input image sequences into symbol sequences for training or recognition. Finally, a VLMM is applied to learn and recognize the symbol sequences that correspond to human behavior patterns. Experiments on real data demonstrate the efficacy of the proposed system.
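The two-stage pipeline described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: Euclidean distance on toy 2-D descriptors stands in for shape context matching, and the names `quantize`, `SimpleVLMM`, and `classify`, along with the parameters `max_order` and `min_count`, are hypothetical. The model keeps counts for contexts up to a maximum length and, when scoring, backs off to the longest context that was observed often enough, which is one simple way to obtain variable-length memory.

```python
import math
from collections import defaultdict


def quantize(frames, codebook):
    """Map each frame descriptor to the index of its nearest key posture.

    Assumption: Euclidean distance replaces the shape context matching cost.
    """
    symbols = []
    for frame in frames:
        distances = [math.dist(frame, key) for key in codebook]
        symbols.append(distances.index(min(distances)))
    return symbols


class SimpleVLMM:
    """Toy variable-length Markov model with longest-suffix backoff."""

    def __init__(self, alphabet_size, max_order=3, min_count=2):
        self.alphabet_size = alphabet_size
        self.max_order = max_order
        self.min_count = min_count
        # counts[context][symbol] = number of times symbol followed context
        self.counts = defaultdict(lambda: defaultdict(int))

    def fit(self, sequences):
        for seq in sequences:
            for i, symbol in enumerate(seq):
                # Record the symbol under every context length up to max_order.
                for order in range(min(i, self.max_order) + 1):
                    context = tuple(seq[i - order:i])
                    self.counts[context][symbol] += 1
        return self

    def _context_for(self, history):
        # Pick the longest suffix of the history seen at least min_count times.
        for order in range(min(len(history), self.max_order), -1, -1):
            context = tuple(history[len(history) - order:])
            if sum(self.counts.get(context, {}).values()) >= self.min_count:
                return context
        return ()

    def log_likelihood(self, seq):
        total = 0.0
        for i, symbol in enumerate(seq):
            seen = self.counts.get(self._context_for(seq[:i]), {})
            # Add-one smoothing keeps unseen symbols from scoring -inf.
            prob = (seen.get(symbol, 0) + 1) / (sum(seen.values()) + self.alphabet_size)
            total += math.log(prob)
        return total


def classify(seq, models):
    """Return the behavior label whose model gives seq the highest likelihood."""
    return max(models, key=lambda name: models[name].log_likelihood(seq))


# Toy data: three key postures and two behaviors, each with its own model.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
training = {
    "walk": [[0, 1, 0, 1, 0, 1, 0, 1]] * 3,
    "wave": [[0, 2, 2, 0, 2, 2, 0, 2, 2]] * 3,
}
models = {name: SimpleVLMM(len(codebook)).fit(seqs) for name, seqs in training.items()}

frames = [(0.1, 0.0), (0.9, 0.1), (0.0, 0.1), (1.1, 0.0)]
symbols = quantize(frames, codebook)  # [0, 1, 0, 1]
print(classify(symbols, models))      # prints "walk"
```

Training one model per behavior and choosing the label with the highest likelihood is an assumption of this sketch; a single model over all behaviors, or a pruned prediction suffix tree, would be equally consistent with the description above.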