Abstract. We present a novel multi-modal evidence fusion method for high-level feature (HLF) detection in videos. Uni-modal features, such as color histograms and transcript texts, tend to capture different aspects of HLFs and hence exhibit both complementariness and redundancy in modeling the contents of such HLFs. We argue that these inter-relations are key to effective multi-modal fusion. Here, we formulate the fusion as a multi-criteria group decision making task, in which the uni-modal detectors are coordinated, based on their inter-relations, toward a consensus final detection decision. Specifically, we mine the complementariness and redundancy inter-relations of the uni-modal detectors using the Ordered Weighted Average (OWA) operator. The ‘or-ness’ measure in OWA models the inter-relation of uni-modal detectors as a combination of pure complementariness and pure redundancy. The resulting OWA weights can then yield a consensus fusion, by optimally leveraging the decisions of the uni-modal detectors...
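To make the operator referenced above concrete, the following minimal sketch computes a standard OWA aggregation of detector scores and its ‘or-ness’. It is only an illustration of the textbook OWA definitions, not the paper's fusion pipeline; the detector scores and weight vectors are made up for the example.

```python
import numpy as np

def owa(scores, weights):
    """Ordered Weighted Average: the weights are applied to the
    detector scores after sorting the scores in descending order."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(weights, dtype=float)
    assert scores.shape == weights.shape
    assert np.isclose(weights.sum(), 1.0)
    ordered = np.sort(scores)[::-1]          # descending order
    return float(np.dot(weights, ordered))

def orness(weights):
    """'Or-ness' of an OWA weight vector: 1.0 behaves like max
    (pure complementariness, OR-like), 0.0 like min (pure redundancy,
    AND-like), and uniform weights give 0.5 (plain averaging)."""
    weights = np.asarray(weights, dtype=float)
    n = len(weights)
    return float(sum((n - 1 - i) * w for i, w in enumerate(weights)) / (n - 1))

# Hypothetical scores from three uni-modal detectors for one HLF
scores = [0.9, 0.4, 0.2]
w_or   = [1.0, 0.0, 0.0]   # orness = 1.0 -> max fusion
w_and  = [0.0, 0.0, 1.0]   # orness = 0.0 -> min fusion
w_mid  = [0.5, 0.3, 0.2]   # orness = 0.65 -> leans toward complementariness
print(owa(scores, w_or), owa(scores, w_and), owa(scores, w_mid))
```

Varying the weight vector sweeps the fused decision between an OR-like and an AND-like combination, which is the intuition behind using the or-ness measure to model the complementariness/redundancy mix of the uni-modal detectors.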