Information extraction (IE) from semi-structured Web documents is a critical issue for information integration systems on the Internet. Previous work in wrapper induction aim to so...
Huge masses of digital data about products, customers and competitors have become available for companies in the services sector. In order to exploit its inherent (and often hidde...
We propose a new method for clustering based on finding maximum margin hyperplanes through data. By reformulating the problem in terms of the implied equivalence relation matrix, ...
Linli Xu, James Neufeld, Bryce Larson, Dale Schuur...
The growing availability of on-line textual sources and the potential number of applications of knowledge acquisition from textual data has lead to an increase in Information Extr...
In this paper, a novel automatic image annotation system is proposed, which integrates two sets of support vector machines (SVMs), namely the multiple instance learning (MIL)-base...