Mining repeated patterns in television broadcasts helps advertisers track large numbers of television commercials. It can also benefit the long-term archival of television, because historically significant events are usually marked by repeated airing of the same video clips or sound bites. In this paper, we describe a system that efficiently mines repeated patterns of arbitrary length from television broadcasts. Compared with existing work, our system has two main innovations. First, it is robust against minor temporal variations among repeated patterns. This is important because broadcasters often edit commercials temporally to fit them into different time slots. Second, it does not rely on any temporal segmentation algorithm, since such algorithms can over- or under-segment important patterns. Instead, our system scans the television broadcast with a fixed-size sliding window, summarizes each window into a hash value, and maintains a ...
Sen-Ching S. Cheung, Thinh Nguyen
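The sliding-window idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract is truncated before it names the structure the system maintains, so a plain dictionary from hash value to window start positions is assumed here. The names `window_hash` and `find_repeated_windows`, the integer per-frame features, and the use of SHA-1 are all hypothetical choices for illustration. The sketch matches only exact repeats and does not attempt the robustness to temporal variations that the abstract describes.

```python
from collections import defaultdict
import hashlib


def window_hash(window):
    """Summarize one window of per-frame feature values into a hash string.

    Assumption: each frame is already reduced to a small integer feature.
    """
    data = ",".join(str(v) for v in window).encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def find_repeated_windows(frames, window_size):
    """Slide a fixed-size window over `frames` and group identical windows.

    Returns a sorted list of start-position lists, one list per window
    content that occurs more than once.
    """
    positions = defaultdict(list)  # hash value -> start indices (assumed structure)
    for start in range(len(frames) - window_size + 1):
        key = window_hash(frames[start:start + window_size])
        positions[key].append(start)
    return sorted(starts for starts in positions.values() if len(starts) > 1)


if __name__ == "__main__":
    # Toy "broadcast": the clip 1, 2, 3, 4 airs twice.
    broadcast = [1, 2, 3, 4, 9, 9, 1, 2, 3, 4, 7]
    print(find_repeated_windows(broadcast, window_size=3))
```

Because every window is hashed independently, no prior segmentation of the stream is needed; longer repeats show up as runs of consecutive matching windows, which a later step could merge.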