We analyze the convergence of randomized trace estimators. Since 1989, several algorithms have been proposed for estimating the trace of a matrix by $\frac{1}{M}\sum_{i=1}^{M} z_i^T A z_i$, where the $z_i$ are random vectors; different estimators use different distributions for the $z_i$, all of which lead to $E\left(\frac{1}{M}\sum_{i=1}^{M} z_i^T A z_i\right) = \operatorname{trace}(A)$. These algorithms are useful in applications in which there is no explicit representation of $A$ but rather an efficient method to compute $z^T A z$ given $z$. Existing results only analyze the variance of the different estimators. In contrast, we analyze the number of samples $M$ required to guarantee that, with probability at least $1 - \delta$, the relative error in the estimate is at most $\epsilon$. We argue that such bounds are much more useful in applications than the variance. We found that these bounds rank the estimators differently than the variance does; this suggests that minimum-variance estimators may not be the best. We also make two additional contributions t...
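To make the estimator concrete, the following is a minimal sketch of the sample average described above, assuming the matrix is available only through a routine that returns the quadratic form for a given vector. It draws each vector with independent ±1 entries, which is one distribution satisfying the unbiasedness condition; other distributions would change only the sampling line. The names `hutchinson_trace`, `quad_form`, and `make_quad_form` are hypothetical and chosen for illustration, and the explicit-matrix helper exists only so the sketch can be run.

```python
import random


def hutchinson_trace(quad_form, n, num_samples, rng=None):
    """Estimate trace(A) as the average of z^T A z over random sign vectors.

    quad_form   -- callable returning z^T A z for a length-n list z
                   (assumption: A itself is never formed by this function)
    n           -- dimension of A
    num_samples -- number of random vectors M
    rng         -- optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    total = 0.0
    for _ in range(num_samples):
        # Each entry is +1 or -1 with equal probability, so E[z z^T] = I
        # and therefore E[z^T A z] = trace(A).
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        total += quad_form(z)
    return total / num_samples


def make_quad_form(A):
    """Hypothetical helper: wrap an explicit matrix (list of rows) as an oracle."""
    def quad_form(z):
        n = len(z)
        return sum(z[i] * sum(A[i][j] * z[j] for j in range(n)) for i in range(n))
    return quad_form
```

A call such as `hutchinson_trace(make_quad_form(A), len(A), 500, random.Random(0))` returns an estimate whose accuracy depends on the number of samples. For a diagonal matrix the ±1 vectors give the exact trace on every sample, since each squared entry equals 1; the off-diagonal entries are what introduce sampling error.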