The paper presents an evaluation of four clustering algorithms: k-means, average linkage, complete linkage, and Ward’s method, the last three being hierarchical methods. Cluster quality was measured in terms of cluster cohesiveness and semantic cohesiveness, and both quantitative and predicate-based similarity criteria were considered. Two similarity matrices were calculated as weighted sums of a set of selected MPEG-7 feature descriptors (representing color, texture, and shape) to measure the effectiveness of clustering subsets of COREL color photo images. The best-quality clusters were formed by the average-linkage hierarchical method. Even when weighted texture and shape similarity measures were used in addition to total color, average linkage outperformed k-means in the formation of both semantic and cohesive clusters. Notably, though, the addition of texture and shape features degraded cluster quality for all three h...