The need for empirical evaluation metrics and algorithms is well acknowledged in the field of computer vision. Empirical evaluation gives precise insight into current technological capabilities and helps measure progress, so designing meaningful performance measures is critical. In this paper, we propose two comprehensive measures, one each for detection and tracking, for video domains where an object-bounding approach to ground truthing can be followed. We present a thorough analysis of how the measures behave under different types of detection and tracking errors. Face detection and tracking is chosen as a prototype task for which such an evaluation is relevant. Results on real data comparing existing algorithms are presented, and the measures are shown to be effective in capturing the accuracy of the detection and tracking systems.