In this paper, we present a new solution to the problem of matching tracking sequences across different cameras. Unlike snapshot-based appearance matching, which compares objects using a single image, we focus on sequence matching to alleviate the uncertainties caused by segmentation errors and partial occlusions. By incorporating multiple snapshots of the same object, the influence of appearance variation is mitigated. At the training stage, given the sequence of a queried person under one camera, the appearance model is formed by concatenating the feature values that receive the majority of votes over the sequence. At the testing stage, Bayesian inference is incorporated into the identification framework to accumulate temporal information across the sequence. Experimental results demonstrate the effectiveness of the proposed method.
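The two stages described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the per-frame feature extractor, the quantization into bins for majority voting, and the Gaussian likelihood used in the Bayesian update are all assumptions introduced here for concreteness.

```python
import numpy as np

def build_appearance_model(seq_features, n_bins=8):
    """Training stage (hypothetical sketch): for each feature dimension,
    quantize the per-frame values into bins, take the majority-voted bin
    over the sequence, and concatenate the winning bin centres into one
    appearance vector. seq_features: (T, D) array with values in [0, 1)."""
    q = np.floor(seq_features * n_bins).clip(0, n_bins - 1).astype(int)
    model = np.empty(seq_features.shape[1])
    for d in range(q.shape[1]):
        counts = np.bincount(q[:, d], minlength=n_bins)
        model[d] = (counts.argmax() + 0.5) / n_bins  # centre of the majority bin
    return model

def bayesian_identify(test_seq, gallery_models, sigma=0.2):
    """Testing stage (hypothetical sketch): start from a uniform prior over
    gallery identities and recursively update the posterior with a Gaussian
    likelihood for each frame, accumulating evidence over the sequence."""
    posterior = np.full(len(gallery_models), 1.0 / len(gallery_models))
    for x in test_seq:
        lik = np.array([np.exp(-np.sum((x - m) ** 2) / (2 * sigma ** 2))
                        for m in gallery_models])
        posterior *= lik               # Bayes rule: posterior ∝ prior × likelihood
        posterior /= posterior.sum()   # renormalize after each frame
    return int(posterior.argmax()), posterior
```

The per-frame updates show why sequence matching is more robust than snapshot matching: a single occluded or badly segmented frame only perturbs one multiplicative factor, and the accumulated posterior still concentrates on the correct identity.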