We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...
Traditional content-based image retrieval (CBIR) systems often fail to meet a user's need due to the 'semantic gap' between the extracted features of the systems and the...
Clinical medical records contain a wealth of information, largely in free-text form. A means of extracting structured information from free-text records is an important research endeav...
Xiaohua Zhou, Hyoil Han, Isaac Chankai, Ann Prestr...
An approach for mining repositories of web-based user documentation for patterns of evolutionary change in the context of internationalization and localization is presented. Sets ...
Although Web search engines have become information gateways to the Internet, for queries containing technical terms, search results often contain pages that are difficult to be ...