During the last decade, speaker verification systems have shown significant progress and have reached a level of performance and accuracy that supports their use in practical applications, including forensic ones. This context emphasizes the importance of analyzing system performance in more depth than a basic error rate allows. In this paper, the influence of the speaker (his/her ‘voice’) on performance is studied, and the effect of the model (the training excerpt) is investigated. The experimental setup is based on an open source system and the experimental context of NIST-SRE 2008. The results confirm that the lower performance levels come from a small number of speakers. Even more than on the speaker factor, speaker verification performance is shown to be highly dependent on the voice samples used to train the speaker models.