Biometric system performance is currently evaluated in terms of the false accept rate (FAR) and false reject rate (FRR). Accuracy expressed in this manner depends on the characteristics of the dataset on which the system has been tested. Using different datasets for evaluation makes a true comparison of such systems difficult, more so when the systems are designed to work on different biometrics, such as fingerprint and signature. We propose a similarity metric, calculated for a pair of fingerprint templates, which quantifies the “confusion” of a fingerprint matcher in evaluating them. We show how such a metric can be calculated for an entire biometric database to indicate how much difficulty a matcher has when evaluating fingerprints from that database. This similarity metric can be calculated in linear time, making it suitable for large datasets.