iFlow is a replication-based system that can achieve both fast and reliable processing of high-volume data streams at Internet scale. iFlow uses a low degree of replication in...
The Open Science Data Cloud is a distributed, cloud-based infrastructure for managing, analyzing, archiving and sharing scientific datasets. We introduce the Open Science Data Clou...
Robert L. Grossman, Yunhong Gu, Joe Mambretti, Mic...
Data is routinely created, disseminated, and processed in distributed systems that span multiple administrative domains. To maintain accountability while the data is transformed b...
Current projects that automate the collection of provenance information use a centralized architecture for managing the resulting metadata - that is, provenance is gathered at rem...
In preparation for the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report, the climate community will run the Coupled Model Intercomparison Project phase 5 (...
Rajkumar Kettimuthu, Alex Sim, Dan Gunter, Bill Al...
We introduce LogGOPSim, a fast simulation framework for parallel algorithms at large scale. LogGOPSim utilizes a slightly extended version of the well-known LogGPS model in combin...
The Semantic Web consists of many billions of statements made of terms that are either URIs or literals. Since these terms usually consist of long sequences of characters, an effe...
A visualization tool for large-scale, high-dimensional data is highly valuable for scientific discovery in many fields. We present PubChemBrowse, a customized visualization tool for c...
Jong Youl Choi, Seung-Hee Bae, Judy Qiu, Geoffrey ...