Compared with video programs captured by professionals, home videos often contain low-quality content resulting from a lack of professional capture skills. In this paper, we present a novel spatio-temporal quality assessment scheme for home videos based on low-level content features. In contrast to existing frame-level quality assessment approaches, a type of temporal video segment, the sub-shot, is selected as the basic unit for quality assessment. A set of spatio-temporal artifacts, regarded as the key factors affecting the overall perceived quality (i.e., unstableness, jerkiness, infidelity, blurring, brightness, and orientation), is mined from each sub-shot based on the particular characteristics of home videos. The relationship between the overall quality metric and these factors is exploited by three different methods: a user study, factor fusion, and a learning-based scheme. To validate the proposed scheme, we present a scalable quality-based home video summar...