This paper examines several different approaches to exploiting structural information in semi-structured document categorization. The methods under consideration are designed for ...
Semi-structured documents (e.g. journal articles, electronic mail, television programs, mail order catalogs, ...) are often not explicitly typed; the only available type inform...
The semi-structured information available in HTML and similar documents provides valuable information that can be used for information extraction applications. This information tog...
The increasing use of XML (eXtensible Markup Language) makes semistructured data more and more important on the Web. To exploit the full power of XML documents, a query l...
Semi-structured data such as XML and HTML is attracting considerable attention. It is important to develop various kinds of data mining techniques that can handle semistructured d...