In digital libraries, semantic techniques are often deployed to reduce the expensive manual overhead of indexing documents, maintaining metadata, or caching for future search. However, because such techniques are statistical in nature, using them may decrease a collection’s quality. Since data quality is a major concern in digital libraries, it is important to be able to measure the (loss of) quality of metadata automatically generated by semantic techniques. In this paper, we present a user study based on a typical semantic technique used for automatic metadata creation, namely taxonomies of author keywords and tag clouds. We observed experts assessing typical relations between keywords and documents over a small corpus in the field of chemistry. Based on the evaluation of this experiment, we focus on commonalities between the experts’ perceptions and thus draw a first roadmap for evaluating semantic techniques by proposing some preliminary metrics.