Many current multimedia database management systems perform content-based retrieval of images by extracting the values of various features from every object stored in their system...
Background: Document classification is a widespread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, ...
Long-term search history contains rich information about a user's search preferences. In this paper, we study methods based on statistical language modeling to mine contextual i...
Tagging systems such as del.icio.us and Diigo have become important ways for users to organize information gathered from the Web. However, despite their popularity among early ado...
Lichan Hong, Ed Huai-hsin Chi, Raluca Budiu, Peter...
A web portal providing access to over 250,000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...