This paper presents two approaches to the representation and recognition of human action in video, both aiming for viewpoint invariance. The paper first reports new results obtained with a previously presented 2D approach. Inherent limitations of the 2D approach are then discussed, and a new 3D approach that builds on recent work on 3D model-based invariants is introduced. Each action is represented as a unique curve in a 3D invariance space, surrounded by an acceptance volume (`action-volume'). Given a video sequence, 2D quantities are computed from the image and matched against candidate action volumes in a probabilistic framework. The theory is presented, followed by results on arbitrary projections of motion-capture data, which demonstrate a high degree of tolerance to viewpoint change.