Biometrics-based authentication is becoming popular because of its increasing ease of use and reliability. Performance evaluation of such systems is an important issue. We attempt to address two aspects of performance evaluation that have been conventionally neglected. First, the "difficulty" of the data used in a study influences the evaluation results. We propose measures to characterize the data set so that the performance of a given system on different data sets can be compared. Second, conventional studies often report the false reject and false accept rates in the form of match score distributions. However, no confidence intervals are computed for these distributions, so no indication of the significance of the estimates is given. In this paper, we systematically study and compare parametric and nonparametric (bootstrap) methods for measuring confidence intervals. We give special attention to false reject rate estimates.
Ruud M. Bolle, Sharath Pankanti, Nalini K. Ratha
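To illustrate the kind of nonparametric (bootstrap) confidence interval discussed in the abstract, here is a minimal Python sketch of a percentile-bootstrap interval for a false reject rate estimate at a fixed decision threshold. This is not the authors' specific procedure; the function name, parameters, and synthetic scores are illustrative assumptions.

```python
import numpy as np

def bootstrap_frr_ci(genuine_scores, threshold, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate of the false reject rate (FRR) at `threshold` plus a
    percentile-bootstrap confidence interval (illustrative sketch only).

    genuine_scores : 1-D array of match scores from genuine comparisons.
    threshold      : scores below this value are treated as rejections.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(genuine_scores)
    frr = np.mean(scores < threshold)  # point estimate of FRR

    # Resample the genuine scores with replacement and recompute the FRR.
    boot_frrs = np.empty(n_boot)
    for b in range(n_boot):
        sample = rng.choice(scores, size=scores.size, replace=True)
        boot_frrs[b] = np.mean(sample < threshold)

    # Percentile bootstrap confidence interval.
    lo, hi = np.percentile(boot_frrs, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return frr, (lo, hi)

if __name__ == "__main__":
    # Synthetic genuine scores for demonstration only (not real data).
    rng = np.random.default_rng(42)
    genuine = rng.normal(loc=0.7, scale=0.1, size=500)
    frr, (lo, hi) = bootstrap_frr_ci(genuine, threshold=0.5)
    print(f"FRR = {frr:.3f}, 95% bootstrap CI = [{lo:.3f}, {hi:.3f}]")
```

The percentile bootstrap used here is one common nonparametric choice; a parametric alternative would instead fit a distribution to the scores and derive the interval from the fitted model.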