Information extraction from HTML pages has been conventionally treated as plain text documents extended with HTML tags. However, the growing maturity and correct usage of HTML/XHT...
Text Mining is one of the best solutions for today and the future’s information explosion. With the development of modern processor technologies, it will be a mass market deskto...
Information extraction (IE) — the problem of extracting structured information from unstructured text — has become an increasingly important topic in recent years. A SIGMOD 20...
Laura Chiticariu, Yunyao Li, Sriram Raghavan, Fred...
Even in a massive corpus such as the Web, a substantial fraction of extractions appear infrequently. This paper shows how to assess the correctness of sparse extractions by utiliz...
Open Information Extraction (IE) is the task of extracting assertions from massive corpora without requiring a pre-specified vocabulary. This paper shows that the output of state...