Numerous projects have evaluated the performance of CORBA middleware over the past decade. Interestingly, many of the published results are either gathered or analyzed imprecisely. We point out common causes of such imprecision, related to the gathering of timing information and to the effects of warm-up, randomization, cross talk, and delayed or hidden functionality, and we demonstrate their impact on the results of the evaluation. We also present suggestions for reporting the results in a manner that better serves the goals of the evaluation.
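To give a concrete flavor of two of the pitfalls mentioned above, the following minimal sketch (not taken from the evaluated projects) shows a measurement loop that discards warm-up iterations and uses a high-resolution monotonic timer; the invoke() method is a hypothetical stand-in for a real CORBA request through an ORB stub, and the choice of iteration counts and reported percentiles is illustrative only.

```java
// Minimal sketch: warm-up and timer resolution in a middleware micro-benchmark.
// invoke() is a hypothetical placeholder; a real benchmark would call an ORB stub here.
public class WarmupTimingSketch {

    // Placeholder for a remote invocation, simulated so the sketch runs without an ORB.
    static void invoke() {
        double x = 0;
        for (int i = 0; i < 1_000; i++) x += Math.sqrt(i);
        if (x < 0) System.out.println(x); // guard against dead-code elimination
    }

    public static void main(String[] args) {
        final int WARMUP = 10_000;   // iterations discarded so JIT compilation, caches,
                                     // and connection setup reach a steady state
        final int MEASURED = 10_000; // iterations whose timings are actually reported

        for (int i = 0; i < WARMUP; i++) invoke();

        long[] samples = new long[MEASURED];
        for (int i = 0; i < MEASURED; i++) {
            long start = System.nanoTime(); // monotonic, nanosecond-resolution timer;
            invoke();                       // a millisecond clock would be too coarse
            samples[i] = System.nanoTime() - start;
        }

        // Reporting percentiles rather than a single average preserves information
        // about the distribution of the measured times.
        java.util.Arrays.sort(samples);
        System.out.printf("median = %d ns, 90th percentile = %d ns%n",
                samples[MEASURED / 2], samples[(int) (MEASURED * 0.9)]);
    }
}
```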