Text detection and recognition in video are challenging because several types of text can be present: graphics text (video captions), scene text (natural text), 2D and 3D text, and static and dynamic text. Developing a universal method that works well for all these types is hard. In this paper, we propose a novel method for classifying graphics versus scene text and 2D versus 3D text in video to improve text detection and recognition accuracy. We first propose an iterative method to classify static and dynamic text clusters, based on the fact that static text has zero velocity while dynamic text does not. This yields text candidates for both static and dynamic text, regardless of whether the text is 2D or 3D. We then propose symmetry detection for the text candidates using stroke width distances and medial axis values, which yields potential text candidates. We group the potential text candidates by their geometrical properties to form text regions. Next, for each text region, we study the distribution of d...