The ImageCLEF 2010 Photo Annotation Task poses the challenge of automatically annotating 93 visual concepts in Flickr photos. Participants were provided with a training set of 8,000 Flickr images including annotations, EXIF data, and Flickr user tags. Testing was performed on 10,000 Flickr images, with submissions differentiated between approaches considering solely visual information, approaches relying on textual information, and multi-modal approaches. Half of the ground truth was acquired with a crowdsourcing approach. The evaluation followed two paradigms: per concept and per example. In total, 17 research teams participated in the multi-label classification challenge with 63 submissions. Summarizing the results, the task could be solved with a MAP of 0.455 in the multi-modal configuration, a MAP of 0.407 in the visual-only configuration, and a MAP of 0.234 in the textual configuration. For the evaluation per example, an F-ex of 0.66 and an OS-FCS of 0.66 could be achieved in the multi-modal configuration.
Stefanie Nowak, Mark J. Huiskes