Knowledge of relationships among categories is of interest in domains such as text classification, content analysis, and text mining. We propose and evaluate approaches to effectively identify relationships among document categories. Our proposed novel method capitalizes on the misclassification results of a text classifier to identify potential relationships among categories. We demonstrate that our system detects such relationships, including relationships that assessors failed to identify in manual evaluation. Furthermore, we compare the effectiveness of our methods favorably with the state-of-the-art method and demonstrate a significant improvement in precision (34%) and recall (5%).

Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Clustering
General Terms: Algorithms, Performance, Experimentation
Keywords: Category relationships, Classification, Clustering
Saket S. R. Mengle, Nazli Goharian, Alana Platt
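
The core idea of using misclassifications as a signal can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `candidate_relationships`, the `min_rate` threshold, and the toy labels are all assumptions made for illustration. The sketch counts how often documents of one category are assigned to another and flags pairs whose misclassification rate meets the threshold.

```python
from collections import Counter


def candidate_relationships(true_labels, predicted_labels, min_rate=0.1):
    """Flag (true, predicted) category pairs that are confused often.

    Hypothetical helper: the rate is the share of documents of the true
    category that the classifier assigned to the other category.
    """
    totals = Counter(true_labels)
    # Count only the misclassified documents, keyed by (true, predicted).
    confusions = Counter(
        (t, p) for t, p in zip(true_labels, predicted_labels) if t != p
    )
    pairs = []
    for (t, p), count in confusions.items():
        rate = count / totals[t]
        if rate >= min_rate:
            pairs.append((t, p, rate))
    # Most frequently confused pairs first.
    return sorted(pairs, key=lambda item: -item[2])


# Toy data (invented for this example, not from the paper).
true_labels = ["sports"] * 4 + ["politics"] * 4 + ["health"] * 2
predicted = [
    "sports", "sports", "health", "health",
    "politics", "politics", "politics", "sports",
    "health", "health",
]
related = candidate_relationships(true_labels, predicted, min_rate=0.3)
print(related)
```

In this toy run, half of the "sports" documents are labeled "health", so that pair is flagged, while the single "politics" to "sports" error falls below the assumed threshold. A real system would need a principled way to choose the threshold and to separate genuine category relationships from plain classifier error.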