We describe an infrastructure for the collection and management of large amounts of text, and discuss the possibility of information extraction and visualisation from text corpora...
The two most important tasks in information extraction from the Web are webpage structure understanding and natural language sentences processing. However, little work has been don...
Chunyu Yang, Yong Cao, Zaiqing Nie, Jie Zhou, Ji-R...
: Information retrieval tries to identify relevant documents for an information need. The problems that an IR system should deal with include document indexing (which tries to extr...
Automated extraction of structured data from Web sources often leads to large heterogeneous knowledge bases (KB), with data and schema items numbering in the hundreds of thousands...
A seed-based framework for textual information extraction allows for weakly supervised acquisition of open-domain class attributes over conceptual hierarchies, from a combination ...