In this paper, we propose a framework for isolating text regions from natural scene images. The main algorithm has two functions: it generates text region candidates, and it verifies of the label of the candidates (text or non-text). The text region candidates are generated through a modified Kmeans clustering algorithm, which references texture features, edge information and color information. The candidate labels are then verified in a global sense by the Markov Random Field model where collinearity weight is added as long as most texts are aligned. The proposed method achieves reasonable accuracy for text extraction from moderately difficult examples from the ICDAR 2003 database.