Video event understanding requires a formalism that can model complex logical, temporal, and spatial relations between the sub-events that compose an event. In this paper we argue that the Petri-Net is such a formalism. We then define a methodology for constructing Petri-Net event models from semantic descriptions of events in two well-known video event ontology standards, VERL and CASEE.