This paper presents an extensive study of the evolution of textual content on the Web, showing how some new pages are created from scratch while others are created using al...
Document-centric XML is a mixture of text and structure. With the increased availability of document-centric XML content comes a need for query facilities in which both structural...
Jaap Kamps, Maarten Marx, Maarten de Rijke, Bö...
The relevance of a web document can be measured not only by its text content but also by other factors, such as link connectivity and usage patterns. In previous data f...
This paper presents a method for automatically annotating and retrieving animal images. Our model is a multi-modality ontology that extends our previous work in the sense that b...
Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...