This paper describes a comprehensive approach to construct robust multi-modal video classification on a specific digital source, broadcast news. Broadcast news has a very stable structure and every segment has its specific purpose. Video classification can support fundamental understanding of the structure of the video and the content. The variety of video content makes it hard to classify; however, it also provides multimodal information. Our approach tries to solve two important issues of multimodal classification. The first one is to select few discriminative features from many raw features and the second one is to efficiently combine multiple sources. We applied Fisher’s Linear Discriminant (FLD) for feature selection and concatenated the projections into a single synthesized feature vector as the combination strategy. Experimental results on the 2003 TRECVID [1] news video archive show that our approach achieves very robust and accurate performance. Categories and Subject Descr...
Ming-yu Chen, Alexander G. Hauptmann