There are a number of established products on the market for wrapping—semi-automatic navigation and extraction of data—from web pages. These solutions make use of the inherent...
This paper describes a parameter estimation method for multi-label classification that does not rely on approximate inference. It is known that multi-label classification involvin...
Abstract. This paper describes an efficient method to construct reliable machine learning applications in peer-to-peer (P2P) networks by building ensemble based meta methods. We co...
This paper considers the problem of identifying on the Web compound documents (cDocs) ? groups of web pages that in aggregate constitute semantically coherent information entities...
Many approaches to Information Extraction (IE) have been proposed in literature capable of finding and extract specific facts in relatively unstructured documents. Their applicatio...