Instead of clustering video shots into scenes using low-level image features, this paper proposes a rule-based model to extract simple dialog or action scenes. By analyzing video editing rules and observing the temporal appearance patterns of shots in movie dialog scenes, we derive a set of rules for recognizing dialog or action scenes. Based on these rules, we design a finite state machine that automatically extracts dialog or action scenes from videos.
Lei Chen, M. Tamer Özsu
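
The abstract does not state the rules themselves, so the following is only an illustrative sketch of how a finite state machine can scan a shot sequence. It assumes each shot already carries a label (for example, an identifier for the camera setup or speaker it shows) and uses one hypothetical rule: a dialog-like scene is a run of at least `min_shots` shots that alternate between two distinct labels (A B A B ...). The state names, the function `extract_alternating_scenes`, and the threshold are assumptions made for illustration, not the authors' model.

```python
from enum import Enum, auto
from typing import List, Sequence, Tuple


class State(Enum):
    IDLE = auto()         # no candidate scene yet
    FIRST = auto()        # one shot of a candidate scene seen
    ALTERNATING = auto()  # two distinct labels are alternating


def extract_alternating_scenes(
    labels: Sequence[str], min_shots: int = 4
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) shot indices of maximal A-B-A-B runs.

    Hypothetical rule: a run counts as a scene if it has at least
    `min_shots` shots. Adjacent runs may share one boundary shot.
    """
    scenes: List[Tuple[int, int]] = []
    state = State.IDLE
    start = 0

    for i, label in enumerate(labels):
        if state is State.IDLE:
            start, state = i, State.FIRST
        elif state is State.FIRST:
            if label != labels[i - 1]:
                state = State.ALTERNATING  # second label found
            else:
                start = i  # same label repeated: restart candidate here
        else:  # State.ALTERNATING
            if label == labels[i - 2]:
                continue  # pattern continues
            # Pattern broken at shot i: the run covers start..i-1.
            if i - start >= min_shots:
                scenes.append((start, i - 1))
            if label != labels[i - 1]:
                # Shots i-1 and i may begin a new alternating pair.
                start, state = i - 1, State.ALTERNATING
            else:
                start, state = i, State.FIRST

    # Flush a run that reaches the end of the sequence.
    if state is State.ALTERNATING and len(labels) - start >= min_shots:
        scenes.append((start, len(labels) - 1))
    return scenes
```

For instance, calling `extract_alternating_scenes(list("XABABAYYCDC"))` scans eleven labeled shots and reports the alternating run of `A` and `B` shots as a single candidate scene. A real system would need additional rules (for action scenes, for shots showing both speakers, and so on), which this sketch does not attempt to model.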