Statistical machine learning techniques for data classification usually assume that all entities are i.i.d. (independent and identically distributed). However, real-world entities...
Image classification is a well-studied and hard problem in computer vision. We extend a proven solution for classifying web spam to handle images. We exploit the link structure of...
Accurate web page classification often depends crucially on information gained from neighboring pages in the local web graph. Prior work has exploited the class labels of nearby p...
This paper explores topic aspect (i.e., subtopic or facet) classification for English and Chinese collections. The evaluation model assumes a bilingual user who has found document...
A taxonomy organizes concepts or topics in a hierarchical structure and can be created manually or via automated systems. A major drawback of taxonomies is that they require users...
Xiaoguang Qi, Dawei Yin, Zhenzhen Xue, Brian D. Da...