Background: This paper introduces the notion of optimizing different norms in the dual problem of support vector machines with multiple kernels. The selection of norms yields diff...
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
We present DL8, an exact algorithm for finding a decision tree that optimizes a ranking function under size, depth, accuracy and leaf constraints. Because the discovery of optimal...
The problem of assessing the significance of data mining results on high-dimensional 0?1 data sets has been studied extensively in the literature. For problems such as mining freq...
Aristides Gionis, Heikki Mannila, Panayiotis Tsapa...
By analogy with merging documents rankings, the outputs from multiple search results clustering algorithms can be combined into a single output. In this paper we study the feasibi...