In this paper, we report on our experience with the creation of an automated, human-assisted process to extract metadata from documents in a large (>100,000), dynamically growi...
Jianfeng Tang, Kurt Maly, Steven J. Zeil, Mohammad...
Despite recent advances in wireless and portable hardware technologies, mobile access to the Web is often laborious. For this reason, several solutions have been proposed to custom...
Leonardo Teixeira Passos, Marco Tulio de Oliveira ...
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
When we talk about using neural networks for data mining we have in mind the original data mining scope and challenge. How did neural networks meet this challenge? Can we run neura...
Information Extraction (IE) from text /web documents has become an important application area of AI. As the number of web sites and documents has grown dramatically, the users need...