Relation between Agreement Measures on Human Labeling and Machine Learning Performance: Results from an Art History Domain