Abstract. XML provides a natural mechanism for representing semistructured and unstructured data. It becomes the basis for encoding a large variety of information, for example, the...
Abstract A rich family of generic Information Extraction (IE) techniques have been developed by researchers nowadays. This paper proposes WebKER, a system for automatically extract...
A semi-structured information space consists of multiple collections of textual documents containing fielded or tagged sections. The space can be highly heterogeneous, because eac...