Combining multiple classifiers is of particular interest in multimedia applications. Each modality in multimedia data can be analyzed individually, and combining multiple pieces of evidence can usually improve classification accuracy. However, most combination strategies used in previous studies rely on ad hoc designs and ignore the varying "expertise" of the specialized individual-modality classifiers in recognizing a category under particular circumstances. In this paper we present a combination framework called "meta-classification", which models the problem of combining classifiers as a classification problem in itself. We apply the technique to a wearable "experience collection" system, which unobtrusively records the wearer's conversation, recognizes the face of the dialogue partner, and remembers his/her voice. When the system sees the same person's face or hears the same voice, it can then use a summary of the last conversation to remind the wearer.
Wei-Hao Lin, Alexander G. Hauptmann
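The core idea can be illustrated with a minimal sketch: each modality gets its own classifier, and the confidence scores those classifiers produce become the input features of a second-stage "meta" classifier, so combining the modalities is itself treated as a classification problem. The synthetic face/voice features, the SVM base classifiers, and the logistic-regression meta-classifier below are illustrative assumptions, not the paper's exact experimental setup.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
face_feats = rng.normal(size=(n, 20))    # stand-in for face-recognition features
voice_feats = rng.normal(size=(n, 12))   # stand-in for speaker-identification features
labels = rng.integers(0, 2, size=n)      # 1 = same person seen/heard before

# Hold out half the data so the meta-classifier is trained on
# base-classifier outputs it did not see during base training.
idx_base, idx_meta = train_test_split(np.arange(n), test_size=0.5, random_state=0)

face_clf = SVC(probability=True).fit(face_feats[idx_base], labels[idx_base])
voice_clf = SVC(probability=True).fit(voice_feats[idx_base], labels[idx_base])

def modality_scores(idx):
    """Stack each modality classifier's confidence scores into one meta-feature vector."""
    return np.hstack([
        face_clf.predict_proba(face_feats[idx]),
        voice_clf.predict_proba(voice_feats[idx]),
    ])

# The meta-classifier learns when to trust each modality's evidence.
meta_clf = LogisticRegression().fit(modality_scores(idx_meta), labels[idx_meta])

# At prediction time, combining the modalities is just one more classification step.
print(meta_clf.predict(modality_scores(idx_meta[:5])))
```

Because the meta-classifier is trained rather than hand-designed, it can in principle learn that, say, the voice classifier is more reliable than the face classifier for certain categories or conditions, which is the "expertise" the ad hoc fusion rules ignore.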