Detecting abnormalities in video is a challenging problem since the class of all irregular objects and behaviors is infinite and thus no (or by far not enough) abnormal training samples are available. Consequently, a standard setting is to find abnormalities without actually knowing what they are because we have not been shown abnormal examples during training. However, although the training data does not define what an abnormality looks like, the main paradigm in this field is to directly search for individual abnormal local patches or image regions independent of another.
To address this problem we parse video frames by establishing a set of hypotheses that jointly explain all the foreground while, at same time, trying to find normal training samples that explain the hypotheses. Consequently, we can avoid a direct detection of abnormalities. They are discovered indirectly as those hypotheses which are needed for covering the foreground without finding an explanation by normal samp...