The WindowDiff evaluation measure [12] is becoming the standard criterion for evaluating text segmentation methods. Nevertheless, this metric does not treat all methods fairly with respect to their characteristics, and the scores it yields on different kinds of corpora are difficult to compare. We therefore first propose an improvement of this measure that accounts for the risks taken by each method on different kinds of text. Furthermore, since producing a reference segmentation is a rather difficult task, this paper describes a new evaluation metric that relies on the stability of segmentations under text transformations. Our experimental results suggest that both proposed metrics provide markedly better indicators of text segmentation accuracy than existing measures.