This paper describes an approach for generating Binary Partition Tree [7] representations and video object segmentations using a novel region merging strategy based on motion similarity measured over multiple frames of an image sequence. The system operates on color-homogeneous regions, tracked across the frames of a shot, that represent an over-segmentation of the objects. A long-term motion similarity measure is introduced for region merging; it offers accurate segmentation of objects and extends the temporal consistency between the tracked partitions to the hierarchical representations of every frame within the shot. Experimental results are presented, illustrating the usefulness of the approach.
Camilo C. Dorea, Ferran Marqués, Montse Par