In this paper we learn a dissimilarity measure for categorical data, for effective classification of the data points. Each categorical feature (with values taken from a finite set...
Jierui Xie, Boleslaw K. Szymanski, Mohammed J. Zak...
In this paper, we propose to use database technology to improve the performance of web proxy servers. We view the cache at a proxy server as a web warehouse with data organized in a h...
Support vector machines (SVMs) achieve by far the state-of-the-art performance on text classification (TC) tasks. Due to the complexity of TC problems, it becomes a ch...
Automatic categorization of videos in a Web-scale unconstrained collection such as YouTube is a challenging task. A key issue is how to build an effective training set in the pres...
Zheshen Wang, Ming Zhao, Yang Song, Sanjiv Kumar, ...
Hierarchical categorization of documents is a task receiving growing interest due to the widespread proliferation of topic hierarchies for text documents. The worst problem of hie...