Identifier attributes--very high-dimensional categorical attributes such as particular product ids or people's names--rarely are incorporated in statistical modeling....
Frequent itemsets and association rules are generally accepted concepts in analyzing item-based databases. The Apriori-framework was developed for analyzing categorical d...
Clustering is an important data mining problem. However, most earlier work on clustering focused on numeric attributes which have a natural ordering to their attribute values. Rec...
To find the optimal branching of a nominal attribute at a node in an L-ary decision tree, one is often forced to search over all possible L-ary partitions for the one that yields t...
Don Coppersmith, Se June Hong, Jonathan R. M. Hosk...
In supervised learning, discretization of the continuous explanatory attributes enhances the accuracy of decision tree induction algorithms and the naive Bayes classifier. M...