This paper proposes a region-based spatio-temporal Markov random field (STMRF) model for the semantic segmentation of moving objects. The STMRF model combines the segmentation results of four successive frames and integrates temporal continuity into a unified energy function. The segmentation procedure consists of two stages: short-term classification and temporal integration. In the first stage, moving objects are extracted by a region-based MRF model applied between two frames of a group of four successive frames. In the second stage, the final semantic object is labeled by minimizing the energy function of the STMRF model. This phased segmentation process corresponds to a multi-level simulated annealing strategy. Experimental results show that the proposed algorithm efficiently captures the semantic meaning of object motion and accurately extracts moving objects.
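
The two-stage procedure can be illustrated with a minimal sketch, which is not the paper's actual energy function. It rests on several assumptions that the abstract does not state: the four frames share one fixed region partition, the short-term stage is run on consecutive frame pairs, each region is summarized by its mean intensity, and the data costs, the threshold `tau`, and the weight `beta` are hypothetical. For determinism, iterated conditional modes (ICM) is used in place of simulated annealing.

```python
from typing import List, Optional, Sequence

def icm_labels(data_cost: Sequence[Sequence[float]],
               neighbors: Sequence[Sequence[int]],
               beta: float = 0.4,
               init: Optional[Sequence[int]] = None,
               sweeps: int = 10) -> List[int]:
    """Approximately minimize
         sum_i data_cost[i][l_i] + beta * sum_{i, j in N(i)} [l_i != l_j]
    over binary labels (0 = static, 1 = moving) with ICM.
    ICM is a greedy stand-in for simulated annealing (an assumption)."""
    n = len(data_cost)
    if init is not None:
        labels = list(init)
    else:
        # Start from the label with the lowest data cost in each region.
        labels = [min((0, 1), key=lambda l: data_cost[i][l]) for i in range(n)]
    for _ in range(sweeps):
        changed = False
        for i in range(n):
            def local(l: int) -> float:
                disagree = sum(1 for j in neighbors[i] if labels[j] != l)
                return data_cost[i][l] + beta * disagree
            best = min((0, 1), key=local)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

def short_term_stage(frames, neighbors, tau=10.0, beta=0.4):
    """Stage 1: one region-based MRF labeling per consecutive frame pair.
    Hypothetical data cost: 'static' costs |difference| / tau, 'moving' costs 1."""
    results = []
    for a, b in zip(frames[:-1], frames[1:]):
        cost = [(abs(y - x) / tau, 1.0) for x, y in zip(a, b)]
        results.append(icm_labels(cost, neighbors, beta))
    return results

def temporal_stage(short_term, neighbors, beta=0.4):
    """Stage 2: temporal integration. The cost of a label is the number of
    short-term labelings that disagree with it (a simple continuity term)."""
    n = len(short_term[0])
    cost = [tuple(sum(1 for lab in short_term if lab[i] != l) for l in (0, 1))
            for i in range(n)]
    return icm_labels(cost, neighbors, beta)

# Toy data: 5 regions in a chain, mean intensity per region in 4 frames.
frames = [
    [10, 50, 50, 10, 10],
    [10, 80, 80, 10, 10],
    [10, 110, 82, 10, 10],   # region 2 barely changes in this pair
    [10, 140, 120, 10, 40],  # region 4 shows a one-off change
]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]

short_term = short_term_stage(frames, neighbors)
final = temporal_stage(short_term, neighbors)
print(short_term)
print(final)
```

In this toy input, region 2 is missed by one short-term labeling and region 4 is falsely detected by another; the temporal stage is meant to show how combining the labelings of a frame group can suppress such isolated disagreements.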