Extracting information from web pages is an important problem; it has several applications such as providing improved search results and construction of databases to serve user qu...
Paramveer S. Dhillon, Sundararajan Sellamanickam, ...
Fully automatic methods that extract lists of objects from the Web have been studied extensively. Record extraction, the first step of this object extraction process, identifies...
The Web is mainly processed by humans. The role of the machines is just to transmit and display the contents of the documents, barely being able to do something else. Nowadays ther...
Recognizing that information from different sources refers to the same (real world) entity is a crucial challenge in instance-level information integration, as it is a pre-requisi...
Paolo Bouquet, Heiko Stoermer, Claudia Nieder&eacu...
In this poster, we present an information extraction engine for web-based forums. The engine analyzes the HTML files crawled from web forums, deduces the wrapper (template) of the...
Hanny Yulius Limanto, Nguyen Ngoc Giang, Vo Tan Tr...