In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on suffix tree document model. By applying the new suffix tree ...
— Ordinal regression is an important type of learning, which has properties of both classification and regression. Here we describe an effective approach to adapt a traditional ...
— Category Ranking is a variant of the multi-label classification problem, in which, rather than performing a (hard) assignment to an object of categories from a predefined set...
The k-means algorithm is a popular clustering method used in many different fields of computer science, such as data mining, machine learning and information retrieval. However, ...
This paper describes the building of a research library for studying the Web, especially research on how the structure and content of the Web change over time. The library is part...
William Y. Arms, Selcuk Aya, Pavel Dmitriev, Blaze...