This paper attempts to quantitatively evaluate the performance trade-offs involved in applying object-oriented software methodologies to region-based representations of images and image sequences. Spatial or temporal multiscale representations of these regions serve as an example of the different levels of granularity that can be encapsulated separately as self-contained objects within these images or scenes. These levels of object granularity are evaluated by weighing representation overhead against object-manipulation functionality and the potential for exploiting concurrency.