Early versus late fusion in semantic video analysis



Semantic analysis of multimodal video aims to index segments of interest at a conceptual level. In reaching this goal, it requires an analysis of several information streams. At some point in the analysis these streams need to be fused. In this paper, we consider two classes of fusion schemes, namely early fusion and late fusion. The former fuses modalities in feature space, the latter fuses modalities in semantic space. We show by experiment on 184 hours of broadcast video data and for 20 semantic concepts, that late fusion tends to give slightly better performance for most concepts. However, for those concepts where early fusion performs better the difference is more significant.

Categories and Subject Descriptors: H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing - Indexing methods

General Terms: Algorithms, Performance

Keywords: Multimedia understanding, early fusion, late fusion, semantic concept detection
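
The contrast between the two schemes in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy nearest-centroid scorer stands in for whatever supervised learner is actually used, and the function names (`train_centroid_scorer`, `early_fusion`, `late_fusion`) and the tiny "visual" and "textual" feature vectors are hypothetical, not taken from the paper. Early fusion concatenates the per-modality features before learning a single concept detector; late fusion learns one detector per modality and then learns a second-stage combiner over their scores.

```python
import math


def train_centroid_scorer(features, labels):
    """Toy stand-in for a supervised learner (an assumption, not the paper's).

    Returns a function mapping a feature vector to a score in (0, 1);
    the score is higher when the vector lies closer to the positive centroid.
    """
    pos = [f for f, y in zip(features, labels) if y == 1]
    neg = [f for f, y in zip(features, labels) if y == 0]
    dim = len(features[0])
    pos_c = [sum(f[i] for f in pos) / len(pos) for i in range(dim)]
    neg_c = [sum(f[i] for f in neg) / len(neg) for i in range(dim)]

    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def score(x):
        # Logistic squashing of the distance difference.
        return 1.0 / (1.0 + math.exp(distance(x, pos_c) - distance(x, neg_c)))

    return score


def early_fusion(visual, textual, labels):
    """Fuse in feature space: concatenate modalities, then learn one detector."""
    fused = [v + t for v, t in zip(visual, textual)]
    scorer = train_centroid_scorer(fused, labels)
    return lambda v, t: scorer(v + t)


def late_fusion(visual, textual, labels):
    """Fuse in semantic space: learn per-modality detectors, then combine scores."""
    vis_scorer = train_centroid_scorer(visual, labels)
    txt_scorer = train_centroid_scorer(textual, labels)
    # Per-modality concept scores form the input of the second-stage learner.
    semantic = [[vis_scorer(v), txt_scorer(t)] for v, t in zip(visual, textual)]
    combiner = train_centroid_scorer(semantic, labels)
    return lambda v, t: combiner([vis_scorer(v), txt_scorer(t)])


# Hypothetical toy data: four video segments, two modalities, one concept.
visual = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
textual = [[1.0], [0.9], [0.0], [0.2]]
labels = [1, 1, 0, 0]

early = early_fusion(visual, textual, labels)
late = late_fusion(visual, textual, labels)

early_pos = early([0.85, 0.85], [0.95])
early_neg = early([0.15, 0.15], [0.1])
late_pos = late([0.85, 0.85], [0.95])
late_neg = late([0.15, 0.15], [0.1])
```

In this sketch the early scheme needs a single learning step over a longer feature vector, while the late scheme needs one learning step per modality plus one for the combiner. The toy data says nothing about which scheme performs better; that is the question the paper's experiments address.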

Cees Snoek, Marcel Worring, Arnold W. M. Smeulders


Early Fusion | Late Fusion | MM 2005 | Semantic |


» The MediaMill TRECVID 2007 Semantic Video Search Engine

» Coherent bag-of audio words model for efficient large-scale video copy detection

Post Info

Added	26 Jun 2010
Updated	26 Jun 2010
Type	Conference
Year	2005
Where	MM
Authors	Cees Snoek, Marcel Worring, Arnold W. M. Smeulders


Sciweavers
