In this paper, we report the development and experiments of IBM Content Harvester (CH), a tool to analyze and recover templates and content from word processor created text docume...
Automated extraction of bibliographic information from journal articles is key to the affordable creation and maintenance of citation databases, such as MEDLINE
Xiaoli Zhang, Jie Zou, Daniel X. Le, George R. Tho...
Both Geographic Information Systems and Information Retrieval have been very active research fields in the last decades. Lately, a new research field called Geographic Informati...
Comments left by readers on Web documents contain valuable information that can be utilized in different information retrieval tasks including document search, visualization, and ...
This paper describes the development of a structured document collection containing user-generated text and numerical metadata for exploring the exploitation of metadata in inform...
Walid Magdy, Jinming Min, Johannes Leveling, Garet...