With the success of blogs as a popular information-sharing medium, searches on blogs have become common. In the blogosphere, tagging is used to annotate blog entries with contextually meaningful keywords, which enable users to locate blog content more easily. Yet, although tags provided by bloggers are effective for organizing blog entries, they are often insufficient to properly capture the semantics of the blog content. In our previous work [7], we observed that there exists a large degree of content overlap among blog entries (not only in the form of quotation/commentary pairs, but also as content borrowing across media outlets), which makes effective, discriminating keyword searches difficult. In this paper, we further note that these implicit or explicit quotations could be leveraged to identify the contexts in which entries occur, thus resulting in more effective tagging. We therefore propose CDIP (a collection-driven, yet individuality-preserving taggin...
Jong Wook Kim, K. Selçuk Candan, Jun'ichi T