In this paper, we propose a learning-based method for video super-resolution with two main contributions. First, our framework combines information from cameras with different spatio-temporal resolutions: a training dictionary is constructed from high-resolution images captured by a still camera, and the low-resolution video is enhanced by searching this customized database. Second, we enforce spatio-temporal constraints using a conditional random field (CRF), posing video super-resolution as the problem of finding the high-resolution video that maximizes the conditional probability given the low-resolution input. We apply the algorithm to video sequences of different scenes captured with cameras of varying quality, and present promising results.
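
The CRF formulation can be sketched as a maximum a posteriori problem. The notation below is assumed for illustration and is not taken from the paper itself: $L$ denotes the observed low-resolution video, $H$ a candidate high-resolution video divided into patches $h_i$, $\phi$ a data potential tying each patch to the observation, $\psi$ a compatibility potential, $\mathcal{N}$ the set of spatially and temporally adjacent patch pairs, and $Z(L)$ the normalizing constant.

```latex
% Assumed notation (hypothetical, for illustration only):
%   L        observed low-resolution video
%   H        candidate high-resolution video, with patches h_i
%   \phi     data potential between a patch and the observation
%   \psi     compatibility potential between neighbouring patches
%   \mathcal{N}  spatially and temporally adjacent patch pairs
\begin{align}
  \hat{H} &= \arg\max_{H} \; P(H \mid L), \\
  P(H \mid L) &= \frac{1}{Z(L)}
    \exp\!\Big( -\sum_{i} \phi(h_i, L)
                - \sum_{(i,j) \in \mathcal{N}} \psi(h_i, h_j) \Big).
\end{align}
```

Under this assumed form, maximizing the conditional probability is equivalent to minimizing the total energy inside the exponential, with the candidate values for each $h_i$ drawn from the dictionary built from the still-camera images.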