Simulating a complete benchmark program on a cycle-accurate model of a microprocessor can take prohibitively long. One proposed methodology, representative sampling, addresses this problem by simulating only a set of unique program phases, called simulation points. The methodology characterizes each fixed-size chunk of instructions in the program using a feature called the Basic Block Vector (BBV), clusters similar chunks into groups, and then selects one representative chunk from each cluster as a simulation point. The accuracy of this technique depends heavily on the choice of feature, clustering technique, and distance measurement used for clustering. Previous research does not fully address all of these aspects. In this paper, we propose a set of statistical metrics for making a comprehensive and fair evaluation of features, clustering algorithms, and distance measurements in representative sam...
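The selection procedure described above (characterize each chunk with a BBV, cluster the chunks, pick one representative per cluster) can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper. The trace format, the function names (`build_bbvs`, `kmeans`, `select_simulation_points`), the use of k-means, and the Euclidean distance are all assumptions made for illustration; the text above names neither a specific clustering algorithm nor a specific distance measurement.

```python
from collections import Counter
from math import dist  # Euclidean distance (Python 3.8+); an assumed choice


def build_bbvs(trace, chunk_size, num_blocks):
    """Return one normalized BBV per full chunk of the trace.

    `trace` is a hypothetical format: a list of (block_id, instr_count)
    pairs, one per basic-block execution. Chunks are closed at the first
    block boundary at or past `chunk_size` instructions (a simplification);
    a trailing partial chunk is dropped.
    """
    bbvs, counts, executed = [], Counter(), 0
    for block_id, n_instr in trace:
        counts[block_id] += n_instr
        executed += n_instr
        if executed >= chunk_size:
            bbvs.append([counts[b] / executed for b in range(num_blocks)])
            counts, executed = Counter(), 0
    return bbvs


def assign(vectors, centroids):
    """Label each vector with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: dist(v, centroids[j]))
            for v in vectors]


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(dist(v, c) for c in centroids)))
    for _ in range(iterations):
        labels = assign(vectors, centroids)
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # leave the centroid of an empty cluster unchanged
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assign(vectors, centroids), centroids


def select_simulation_points(vectors, labels, centroids):
    """Per cluster, pick the chunk nearest the centroid and its weight."""
    points = []
    for j, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            rep = min(members, key=lambda i: dist(vectors[i], centroid))
            points.append((rep, len(members) / len(vectors)))
    return points


# Toy trace with two phases: blocks {0, 1}, then {2, 3}, then {0, 1} again.
trace = [(0, 5), (1, 5)] * 4 + [(2, 5), (3, 5)] * 4 + [(0, 5), (1, 5)] * 2
bbvs = build_bbvs(trace, chunk_size=10, num_blocks=4)
labels, centroids = kmeans(bbvs, k=2)
points = select_simulation_points(bbvs, labels, centroids)
print(points)  # [(0, 0.6), (4, 0.4)]
```

Each returned pair is a chunk index and the fraction of chunks its cluster covers, so in this toy trace chunk 0 stands in for 60% of execution and chunk 4 for the remaining 40%. Swapping `dist` or `kmeans` for another distance measurement or clustering algorithm changes which chunks are selected, which is the sensitivity the evaluation above is concerned with.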