We have developed a threaded parallel data streaming approach using Logistical Networking (LN) to transfer multi-terabyte simulation data from computers at NERSC to our local anal...
Viraj Bhat, Scott Klasky, Scott Atchley, Micah Bec...
Many recommendation systems suggest items to users by utilizing the techniques of collaborative filtering (CF) based on historical records of items that the users have viewed, purc...
Yunhong Zhou, Dennis M. Wilkinson, Robert Schreibe...
This paper presents a method for compiling a large-scale bilingual corpus from a database of movie subtitles. To create the corpus, we propose an algorithm based on Gale and Churc...
Large web or e-commerce sites are frequently hosted on clusters. Successful open-source tools exist for clustering the front tiers of such sites (web servers and application serve...
The Web contains a large amount of documents and increasingly, also semantic data in the form of RDF triples. Many of these triples are annotations that are associated with docume...