We describe the design and implementation of a high-performance cloud that we have used to archive, analyze and mine large distributed data sets. By a cloud, we mean an infrastruc...
In this paper, we discuss some of the lessons that we have learned working with the Hadoop and Sector/Sphere systems. Both of these systems are cloud-based systems designed to sup...
Applying cloud computing techniques to analyze large data sets has shown promise in many data-driven scientific applications. The approach presented here is to use cloud comput...
Kalpa Gunaratna, Paul Anderson, Ajith Ranabahu, Am...
In this paper, we investigate the use of data mining, in particular text classification and co-training techniques, to identify more relevant passages based on a small set of...
Xiangji Huang, Yan Rui Huang, Miao Wen, Aijun An, ...
One of the classic data mining problems is the discovery of frequent itemsets. This problem particularly attracts the database community, as it resembles traditional database querying. In t...
Maciej Zakrzewicz, Mikolaj Morzy, Marek Wojciechow...