Extracting information from web pages is an important problem; it has several applications such as providing improved search results and construction of databases to serve user qu...
Paramveer S. Dhillon, Sundararajan Sellamanickam, ...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Abstract A rich family of generic Information Extraction (IE) techniques have been developed by researchers nowadays. This paper proposes WebKER, a system for automatically extract...
Web forums have become an important data resource for many web applications, but extracting structured data from unstructured web forum pages is still a challenging task due to bo...
Jiang-Ming Yang, Rui Cai, Yida Wang, Jun Zhu, Lei ...
Enterprises provide professionally authored content about their products/services in different languages for use in web sites and customer care. For customer care, personalization...