The paper introduces a framework for clustering data objects in a similarity-based context. The aim is to cluster objects into a given number of classes without imposing a hard pa...
The family of threshold algorithms (TA) has been widely studied for efficiently computing top-k queries. TA uses a sort-merge framework that assumes data lists are pre-sorted...
Data mining algorithms have been the focus of much research recently. In practice, the input data to a data mining process resides in a large data warehouse whose data is kept up-...
Venkatesh Ganti, Johannes Gehrke, Raghu Ramakrishn...
In the context of large databases, data preparation takes on greater importance: instances and explanatory attributes have to be carefully selected. In supervised learning, instanc...
Data services for the Grid have focussed so far primarily on virtualising access to distributed databases, and encapsulating file location. However, orchestration of services requ...
Andrew Woolf, Ray Cramer, Marta Gutierrez, Kerstin...