We propose a novel semantic video summarization framework that generates video skims with both balanced content coverage and visual coherence. First, we collect video semantic information with a semi-automatic video annotation tool. Second, we analyze the video structure and determine the target skim length for each video scene. Then, the mutual reinforcement principle is used to compute the relative importance of the video shots and to cluster them according to their semantic descriptions. Finally, we analyze the arrangement patterns of the video shots and, guided by the shot importance values, extract the key arrangement patterns to form the final video skim. Experiments are conducted to evaluate the effectiveness of the proposed approach.
Shi Lu, Michael R. Lyu, Irwin King
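The mutual reinforcement step can be illustrated with a small sketch. The idea (HITS-style) is that a shot is important if it carries important semantic terms, and a term is important if it appears in important shots; iterating the two updates to a fixed point yields relative importance scores. The function name, the binary shot-by-term association matrix, and the normalization choice below are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def mutual_reinforcement_scores(assoc, iters=50, tol=1e-9):
    """Hypothetical sketch of mutual reinforcement scoring.

    assoc: (n_shots, n_terms) matrix; assoc[i, j] > 0 means shot i is
    annotated with semantic term j. Returns unit-norm importance
    vectors for shots and terms.
    """
    n_shots, n_terms = assoc.shape
    shots = np.ones(n_shots)
    terms = np.ones(n_terms)
    for _ in range(iters):
        # A term is important if important shots are annotated with it.
        new_terms = assoc.T @ shots
        new_terms /= np.linalg.norm(new_terms) or 1.0
        # A shot is important if it carries important terms.
        new_shots = assoc @ new_terms
        new_shots /= np.linalg.norm(new_shots) or 1.0
        converged = np.allclose(new_shots, shots, atol=tol)
        shots, terms = new_shots, new_terms
        if converged:
            break
    return shots, terms
```

For example, a shot annotated with two widely shared terms will score above a shot carrying a single term; the resulting shot scores could then guide clustering and the selection of key arrangement patterns.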