Abstract. This paper evaluates strategies to combine multiple segmentations of the same image, generated for example by different segmentation methods or by different human experts. Three methods are compared, each estimating and using a different level of prior knowledge about the segmenters. These three methods are: simple label averaging (no priors), a binary expectation maximization (EM) method with independent per-label priors [Warfield et al., MICCAI 2002], and a simultaneous multi-label EM method with across-label priors [Rohlfing et al., IPMI 2003]. The EM methods estimate the accuracies of the individual segmentations with respect to the (unknown) ground truth. These estimates, analogous to expert performance parameters, are then applied as weights in the actual combination step. In the case of the multi-label EM method, typical misclassification behavior, caused for example by neighborhood relationships of different tissues, is also modeled. A validation study using the MNI B...
Torsten Rohlfing, Daniel B. Russakoff, Calvin R. M
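To make the difference between the first two strategies named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It contrasts per-voxel voting (no prior knowledge about the segmenters) with a binary EM scheme that estimates a sensitivity and a specificity for each segmenter and uses them to weight the combination. The function names `majority_vote` and `binary_em`, the initial value 0.9, the fixed foreground prior, the tie-breaking rule, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(segmentations):
    """Per-voxel voting over label maps; uses no knowledge about segmenters.

    Ties go to the smallest label (an arbitrary choice for this sketch).
    """
    combined = []
    for votes in zip(*segmentations):
        counts = Counter(votes)
        best = max(counts.values())
        combined.append(min(l for l, c in counts.items() if c == best))
    return combined


def binary_em(decisions, iterations=20, init=0.9, eps=1e-6):
    """Binary EM sketch for one label.

    decisions: list of equal-length 0/1 sequences, one per segmenter.
    Returns (w, p, q): w[i] is the estimated probability that voxel i is
    foreground in the unknown ground truth; p[j] and q[j] are the estimated
    sensitivity and specificity of segmenter j.
    """
    n_seg, n_vox = len(decisions), len(decisions[0])
    p = [init] * n_seg
    q = [init] * n_seg
    # Assumption: fixed foreground prior taken from the overall vote fraction.
    prior = sum(sum(d) for d in decisions) / (n_seg * n_vox)
    prior = min(max(prior, eps), 1.0 - eps)
    w = [prior] * n_vox
    for _ in range(iterations):
        # E-step: posterior foreground probability for every voxel.
        for i in range(n_vox):
            a, b = prior, 1.0 - prior
            for j in range(n_seg):
                if decisions[j][i]:
                    a *= p[j]
                    b *= 1.0 - q[j]
                else:
                    a *= 1.0 - p[j]
                    b *= q[j]
            w[i] = a / (a + b)
        # M-step: re-estimate each segmenter's performance parameters.
        fg = max(sum(w), eps)
        bg = max(n_vox - sum(w), eps)
        for j in range(n_seg):
            tp = sum(w[i] for i in range(n_vox) if decisions[j][i])
            tn = sum(1.0 - w[i] for i in range(n_vox) if not decisions[j][i])
            # Clamp away from 0 and 1 so the E-step never divides by zero.
            p[j] = min(max(tp / fg, eps), 1.0 - eps)
            q[j] = min(max(tn / bg, eps), 1.0 - eps)
    return w, p, q


# Toy data: two segmenters that agree and one that is close to random.
raters = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 1, 1, 0, 1, 0],
]
voted = majority_vote(raters)
w, p, q = binary_em(raters)
em_labels = [1 if wi > 0.5 else 0 for wi in w]
```

On this toy input both strategies return the same labels, but the EM sketch additionally assigns the third segmenter sensitivity and specificity estimates near 0.5, so its votes carry little weight. The multi-label EM method described in the abstract would replace the two scalars per segmenter with a full matrix of across-label confusion probabilities; that extension is not shown here.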