For a wide variety of classification algorithms, scalability to large databases can be achieved by observing that most algorithms are driven by a set of sufficient statistics that...
In this paper, we present an algorithm that can classify large-scale text data with high classification quality and fast training speed. Our method is based on a novel extension o...
Dong Zhuang, Benyu Zhang, Qiang Yang, Jun Yan, Zhe...
The problem of document classification considers categorizing or grouping of various document types. Each document can be represented as a bag of words, which has no straightforw...
Organizations today collect and store large amounts of data in various formats and locations. However they are sometimes required to locate all instances of a certain type of data....
This paper investigates a brute-force technique for mining classification rules from large data sets. We employ an association rule miner enhanced with new pruning strategies to c...