A fundamental problem in data management is to draw and maintain a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With la...
Graham Cormode, S. Muthukrishnan, Ke Yi, Qin Zhang
K-means is a widely used partitional clustering method. While there are considerable research efforts to characterize the key features of K-means clustering, further investigation...
We propose a novel cost-efficient approach to threshold selection for binary web-page classification problems with imbalanced class distributions. In many binary-classification ta...
The established host-centric networking paradigm is challenged by handicaps related to disconnected operation, mobility, and broken locator/identifier semantics. This paper...
This paper presents a two-part study on managing distributed NUCA (Non-Uniform Cache Architecture) L2 caches in a future manycore processor to obtain high single-thread program per...