Abstract. A major characteristic of text document categorization problems is the extremely high dimensionality of text data. In this paper we explore the usability of the Oscillati...
We propose and test an objective criterion for evaluating clustering performance: How well does a clustering algorithm, run on unlabeled data, aid a classification algorithm? The...
People cannot type as fast as they think, especially when faced with the constraints of mobile devices. There have been numerous approaches to solving this problem, including rese...
Kearns introduced the "statistical query" (SQ) model as a general method for producing learning algorithms that are robust to classification noise. We extend this ...
Web extraction systems attempt to use the immense amount of unlabeled text on the Web to create large lists of entities and relations. Unlike traditional IE methods, the ...