In recent years, weblogs (or blogs) have gained great popularity worldwide, and video blogs (or vlogs) are playing an increasingly important role among them. As vlogs grow in popularity, making them more easily accessible has become a hot research topic. In this paper, we propose a novel automatic annotation model for vlogs. The model extracts informative keywords from both the target vlog itself and external resources that are semantically and visually relevant to it. We also present a new evaluation criterion, which assigns a score to an annotation according to its accuracy and completeness in representing the vlog’s semantics. Experimental results demonstrate the effectiveness of both the annotation model and the evaluation criterion.