We propose a novel greedy similarity measure for segmenting long spatio-temporal video sequences. First, a principal curve of the motion region across the frames of a video sequence is constructed to represent its trajectory. Then, HMMs are trained to model the constructed principal curves of the trajectories of predefined gestures. For a long input video sequence, a greedy similarity measure is established to segment it automatically into gestures while simultaneously recognizing them: the true breakpoints of its principal curve are found by maximizing the joint probability of two successive candidate segments conditioned on the gesture models learned by the HMMs. The method is flexible, highly accurate, and robust to noise owing to the exploitation of principal curves, the combination of two successive candidate segments, and the simultaneous recognition. Experiments, including comparisons with two established methods, demonstrate the effectiveness of the proposed method.
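The breakpoint search described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the hypothetical `segment_loglik` replaces a trained HMM's segment likelihood with a simple per-frame Gaussian score, and the right-hand candidate segment is simplified to the remainder of the sequence. The greedy step, picking the breakpoint that maximizes the joint log-probability of the two successive candidate segments under their best-matching gesture models, mirrors the idea in the text.

```python
import math

# Hypothetical stand-in for HMM segment scoring: each "gesture model" is a
# 1-D Gaussian (mu, sigma) over per-frame feature values; the segment
# log-likelihood is the sum of per-frame log-densities. A real system would
# score segments with trained HMMs instead.
def segment_loglik(model, segment):
    mu, sigma = model
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2)
        - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in segment
    )

def best_model(models, segment):
    """Return (best log-likelihood, best gesture label) for a segment."""
    return max((segment_loglik(m, segment), label) for label, m in models.items())

def greedy_segment(sequence, models, min_len=2):
    """Greedily place breakpoints: each breakpoint t maximizes the joint
    log-probability of the two successive candidate segments it creates,
    each scored under its best-matching gesture model; the left segment's
    label is emitted, so segmentation and recognition happen together."""
    breakpoints, labels, start = [], [], 0
    while len(sequence) - start > 2 * min_len:
        best = None
        for t in range(start + min_len, len(sequence) - min_len + 1):
            left_ll, left_lab = best_model(models, sequence[start:t])
            right_ll, _ = best_model(models, sequence[t:])
            joint = left_ll + right_ll  # joint score of the two segments
            if best is None or joint > best[0]:
                best = (joint, t, left_lab)
        _, t, left_lab = best
        breakpoints.append(t)
        labels.append(left_lab)
        start = t
    labels.append(best_model(models, sequence[start:])[1])
    return breakpoints, labels

# Toy usage: two gesture models and a sequence with an obvious regime change.
models = {"A": (0.0, 1.0), "B": (5.0, 1.0)}
seq = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2]
print(greedy_segment(seq, models))  # → ([3], ['A', 'B'])
```

The joint score over both neighboring segments, rather than scoring each segment in isolation, is what makes a wrongly placed breakpoint costly on both sides and thus helps reject spurious cuts.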