EuroGOV is a multilingual web corpus that was created to serve as the document collection for WebCLEF, the CLEF 2005 web retrieval task. EuroGOV is a collection of web pages crawl...
Structure analysis of table form documents is an important issue because a printed document and even an electronic document do not provide logical structural information but merely...
Social annotation via so-called collaborative tagging describes the process by which many users add metadata in the form of unstructured keywords to shared content. In this paper,...
This paper presents a hybrid concept hierarchy development technique for web returned documents retrieved by a meta-search engine. The aim of the technique is to separate the init...
Razvan Stefan Bot, Yi-fang Brook Wu, Xin Chen, Qua...
It is necessary to provide a method to store Web information effectively so it can be utilised as a future knowledge resource. A commonly adopted approach is to classify the retri...