This paper presents a framework for video event detection that combines feature extraction, model learning, and likelihood computation. First, independent component analysis (ICA) is applied to the raw feature space to extract spatial features. Then, a model based on ICA mixture hidden Markov models (ICAMHMM) is used to exploit the spatial and temporal characteristics of the training data. After the model is learnt, the likelihood of a given video sequence is computed and used to classify the sequence into a semantic event. Golf video sequences are used for the simulations. The results show that the proposed method can effectively detect semantic video events.
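
The likelihood-based classification step can be illustrated with a minimal sketch. It is not the ICAMHMM of this paper: as a simplifying assumption, each event is modelled by an ordinary discrete-observation HMM, the ICA feature extraction is omitted, and the observations are taken to be already quantised symbols. The event names `event_a` and `event_b` and all model parameters are hypothetical values chosen for illustration only. The sketch computes the log-likelihood of a sequence under each model with the forward algorithm and assigns the sequence to the highest-scoring event.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def log_likelihood(obs, start, trans, emit):
    """Forward algorithm in log space: returns log P(obs | model).

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability that state i emits symbol k
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][o])
            for j in range(n)
        ]
    return logsumexp(alpha)


def classify(obs, models):
    """Return the event whose model gives the highest log-likelihood, plus all scores."""
    scores = {name: log_likelihood(obs, *params) for name, params in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical two-state, two-symbol models, one per event (illustrative values).
MODELS = {
    "event_a": (
        [0.9, 0.1],
        [[0.8, 0.2], [0.2, 0.8]],
        [[0.9, 0.1], [0.2, 0.8]],
    ),
    "event_b": (
        [0.1, 0.9],
        [[0.8, 0.2], [0.2, 0.8]],
        [[0.1, 0.9], [0.3, 0.7]],
    ),
}

if __name__ == "__main__":
    label, scores = classify([0, 0, 0, 0], MODELS)
    print(label, scores)
```

In this sketch one model is kept per event, so adding a new event only requires training one more model and adding it to `MODELS`; the decision rule itself does not change.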