Training a good text detector requires a large amount of labeled data, which can be very expensive to obtain. Co-training has been shown to be a powerful semi-supervised learning tool for solving many problems using a large amount of unlabeled data. However, the augmented data produced by a co-training process can degrade classifier performance because of noise introduced by the unlabeled data. This paper makes two contributions by proposing a modified co-training scheme for text detection. First, to obtain cleaner augmented data, the new algorithm integrates authority knowledge about the unlabeled data into co-training: the text recognition output for each selected unlabeled image patch serves as the authority and is combined with the classifier prediction to decide whether the sample is added to the augmented set. Second, instead of evenly combining the predictions of the two co-training classifiers, a weighted combination is learned and used to produce the final prediction. Contributions of the new al...
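The two ideas described above can be illustrated with a minimal sketch. It is not the paper's implementation: the agreement rule in `authority_filter`, the confidence threshold, the grid search in `learn_weight`, and all function and variable names are assumptions made for illustration. The text states only that recognition output is combined with the classifier prediction and that a weighted combination is learned.

```python
from typing import List, Tuple


def authority_filter(
    candidates: List[Tuple[str, float, bool]], threshold: float = 0.9
) -> List[Tuple[str, int]]:
    """Keep only candidates on which classifier and recognizer agree.

    Each candidate is (patch_id, p_text, ocr_says_text), where p_text is the
    classifier's probability that the patch contains text and ocr_says_text
    is a hypothetical boolean summary of the text recognition output.
    The agreement rule and threshold are assumptions, not the paper's rule.
    """
    accepted = []
    for patch_id, p_text, ocr_says_text in candidates:
        if p_text >= threshold and ocr_says_text:
            accepted.append((patch_id, 1))  # confident text, authority agrees
        elif p_text <= 1.0 - threshold and not ocr_says_text:
            accepted.append((patch_id, 0))  # confident non-text, authority agrees
    return accepted


def combine(p1: float, p2: float, w: float) -> float:
    """Weighted combination of the two classifiers' text probabilities."""
    return w * p1 + (1.0 - w) * p2


def learn_weight(
    preds1: List[float], preds2: List[float], labels: List[int], steps: int = 11
) -> Tuple[float, float]:
    """Pick the weight in [0, 1] with the best accuracy on labeled data.

    A plain grid search stands in for whatever learning procedure the
    paper uses; w = 0.5 would correspond to the even combination.
    """
    best_w, best_acc = 0.5, -1.0
    for i in range(steps):
        w = i / (steps - 1)
        correct = sum(
            int((combine(a, b, w) >= 0.5) == bool(y))
            for a, b, y in zip(preds1, preds2, labels)
        )
        acc = correct / len(labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc
```

In this sketch a patch that the classifier scores confidently but the recognizer contradicts is simply left out of the augmented set, which is one plausible way to reduce label noise. The learned weight moves away from 0.5 when one classifier is more reliable than the other on held-out labeled data.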