This paper defines a new stacked generalization framework in the context of information extraction (IE) from online sources. The proposed setting removes the constraint of apply...
The standard method for combating spam, either in email or on the web, is to train a classifier on manually labeled instances. As the spammers change their tactics, the performanc...
Deepak Chinavle, Pranam Kolari, Tim Oates, Tim Fin...
Ambiguities in company names are omnipresent. This is not accidental: companies deliberately choose ambiguous brand names as part of their marketing and branding strategy. This pro...
When dealing with information overload from the Internet, such as the classification of Web pages and the filtering of email spam, a new technique called co-training has been shown...
When publishing documents on the web, the user needs to describe and classify her documents for the benefit of later retrieval and use. This paper presents an approach to semanti...