The Web has become the world’s largest information source. Unfortunately, the main success factor of the Web, the inherent principle of distribution and autonomy of the participa...
The automatic extraction of information from unstructured sources has opened up new avenues for querying, organizing, and analyzing data by drawing upon the clean semantics of str...
We study a novel shallow information extraction problem that involves extracting sentences of a given set of topic categories from medical forum data. Given a corpus of medical fo...
urgent need to promote Chinese in this paper we will raise the significance of keyword extraction using a new PAT-treebased approach, which is efficient in automatic keyword extra...
This paper outlines our approach to the creation of annotated corpora for the purposes of Web Information Extraction, and presents the Web Annotation tool. This tool enables the a...