Sciweavers

GPC
2007
Springer

Server-Side Parallel Data Reduction and Analysis

Abstract. Geoscience analysis is currently limited by cumbersome access and manipulation of large datasets from remote sources. Due to their data-heavy and compute-light nature, these analysis workloads represent a class of applications unsuited to a computational grid optimized for compute-intensive applications. We present the Script Workflow Analysis for MultiProcessing (SWAMP) system, which relocates data-intensive workflows from scientists’ workstations to the hosting datacenters in order to reduce data transfer and exploit locality. Our colocation of computation and data leverages the typically reductive characteristics of these workflows, allowing SWAMP to complete workflows in a fraction of the time and with much less data transfer. We describe SWAMP’s implementation and interface, which is designed to leverage scientists’ existing script-based workflows. Tests with a production geoscience workflow show drastic improvements not only in overall execution time, but in...
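The transfer argument in the abstract can be illustrated with a minimal sketch. It is not SWAMP's implementation. The function name, the `reduction_factor` parameter, and the sample numbers are all hypothetical and are not taken from the paper. The sketch only assumes that a reductive workflow produces output much smaller than its input, so running it where the data lives means only the result crosses the network.

```python
def bytes_transferred(input_bytes: int, reduction_factor: int, server_side: bool) -> int:
    """Bytes that cross the network for one run of a reductive workflow.

    reduction_factor is a hypothetical parameter: input size divided by
    output size (for example, 100 means the result is 1/100 of the input).
    """
    if reduction_factor < 1:
        raise ValueError("reduction_factor must be >= 1")
    if server_side:
        # Computation runs at the data center; only the reduced result is sent.
        return input_bytes // reduction_factor
    # Computation runs on the workstation; the full input is downloaded first.
    return input_bytes


# Illustrative numbers only (assumed, not measurements from the paper).
input_bytes = 8 * 10**9      # 8 GB of source data
reduction_factor = 100       # result is 1% of the input

client = bytes_transferred(input_bytes, reduction_factor, server_side=False)
server = bytes_transferred(input_bytes, reduction_factor, server_side=True)
print(f"client-side transfer: {client} bytes")
print(f"server-side transfer: {server} bytes")
```

Under these assumed numbers the server-side run moves one hundredth of the data. The savings for a real workflow depend on how reductive it actually is.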
Daniel L. Wang, Charles S. Zender, Stephen F. Jenks
Added 07 Jun 2010
Updated 07 Jun 2010
Type Conference
Year 2007
Where GPC
Authors Daniel L. Wang, Charles S. Zender, Stephen F. Jenks