Abstract— As the datasets used to fuel modern scientific discovery grow increasingly large, they become increasingly difficult to manage using conventional software. Parallel d...
Sarah Loebman, Dylan Nunley, YongChul Kwon, Bill H...
Abstract. In this paper we describe a methodology for harvesting information from large distributed repositories (e.g. large Web sites) with minimum user intervention. The methodol...
Fabio Ciravegna, Sam Chapman, Alexiei Dingli, Yori...
Design of cluster file system is very important for building a general-purpose cluster with commodity components. To provide scalable high I/O performance needed in the scientific...
Model checking based on validating temporal logic formulas has proven practical and effective for numerous software engineering applications. As systems based on this approach ha...
Text classification using a small labeled set and a large unlabeled data is seen as a promising technique to reduce the labor-intensive and time consuming effort of labeling traini...