Social annotation via so-called collaborative tagging describes the process by which many users add metadata, in the form of unstructured keywords, to shared content. In this paper, we study social annotations and tagging with regard to their usefulness for web document classification by analyzing large sets of real-world data. We investigate which kinds of documents end users annotate more than others, how users tend to annotate these documents, and in particular how the resulting user-generated folksonomy compares with a top-down taxonomy maintained by classification experts for the same set of documents. We discuss what the results imply for further research and development in the areas of document classification and information retrieval.

Categories and Subject Descriptors
I.7.4 [Document and Text Processing]: Electronic Publishing; I.7.1 [Document and Text Processing]: Document and Text Editing--Document Management; H.3.3 [Informatio...
Michael G. Noll, Christoph Meinel