The explosion of user-generated content on the Web has led to new opportunities and significant challenges for companies that are increasingly concerned about monitoring the disc...
In this paper, we propose the "Democratic Classifier", a simple, democracy-inspired pattern-based classification algorithm that uses very short patterns for classificatio...
In this paper we present a novel strategy, DragPushing, for improving the performance of text classifiers. The strategy is generic and takes advantage of training errors to succes...
Songbo Tan, Xueqi Cheng, Moustafa Ghanem, Bin Wang...
A major characteristic of text document categorization problems is the extremely high dimensionality of text data. In this paper we explore the usability of the Oscillati...
There is an ever-increasing number of electronic documents available today, and the task of organizing and categorizing this ever-growing corpus of electronic documents has become t...