An important module in video summarization and retrieval is scene segmentation: grouping temporally and spatially coherent shots into high-level semantic video clips. In this paper, we propose a novel scene segmentation and categorization approach using normalized graph cuts (NCuts). Starting from a set of shots, we first calculate shot similarity from the shot key frames. We then model scene segmentation as a graph partition problem in which each node is a shot and each edge weight represents the similarity between two shots. We employ NCuts to find the optimal scene segmentation and use the Q function to determine the optimal number of scenes automatically. To discover more useful information from scenes, we analyze the temporal layout patterns of shots and automatically categorize scenes into two types: parallel event scenes and serial event scenes. Extensive experiments are conducted on movies and TV series. The promising results demonstrate that the proposed NCuts-based scene...
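To make the graph-partition view concrete, the following is a minimal sketch of a single two-way normalized cut over a shot similarity matrix. It assumes the standard spectral relaxation of NCuts (second-smallest eigenvector of the normalized Laplacian, split by sign). The function names `ncut_bipartition` and `ncut_value` and the toy similarity matrix are illustrative assumptions, not taken from the paper, and the key-frame similarity measure and the Q function used to choose the number of scenes are not reproduced here.

```python
import numpy as np


def ncut_bipartition(W):
    """Split the nodes (shots) of a similarity graph into two groups.

    W is a symmetric, non-negative similarity matrix with positive row sums.
    Uses the spectral relaxation of the normalized cut: the eigenvector of
    the second-smallest eigenvalue of I - D^{-1/2} W D^{-1/2}, thresholded
    at zero. Returns a boolean array; which side is True is arbitrary.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                      # node degrees
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)     # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]         # map back to the generalized problem
    return y > 0


def ncut_value(W, labels):
    """Normalized cut cost: cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V)."""
    W = np.asarray(W, dtype=float)
    a = np.asarray(labels, dtype=bool)
    cut = W[a][:, ~a].sum()
    return cut / W[a].sum() + cut / W[~a].sum()


# Toy example (hypothetical values): six shots forming two coherent groups.
W_demo = np.full((6, 6), 0.05)
W_demo[:3, :3] = 0.9
W_demo[3:, 3:] = 0.9
np.fill_diagonal(W_demo, 0.0)

labels_demo = ncut_bipartition(W_demo)
print(labels_demo, ncut_value(W_demo, labels_demo))
```

Applying the bipartition recursively, or clustering several eigenvectors at once, would yield more than two scenes; how many to keep is the role the abstract assigns to the Q function.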