Data parallelism in bioinformatics workflows using Hydra



Large-scale bioinformatics experiments are usually composed of a set of data flows generated by a chain of activities (programs or services) that may be modeled as scientific workflows. Scientific Workflow Management Systems (SWfMS) orchestrate these workflows, controlling and monitoring the whole execution. Bioinformatics experiments commonly process very large datasets, so data parallelism is a common approach to increase performance and reduce overall execution time. However, most current SWfMS still lack support for parallel execution in high performance computing (HPC) environments. Additionally, keeping track of provenance data in distributed environments is still an open, yet important, problem. Recently, the Hydra middleware was proposed to bridge the gap between SWfMS and the HPC environment by providing a transparent way for scientists to parallelize workflow executions while capturing distributed provenance. This paper analy...
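
As a rough illustration of the data-parallel pattern the abstract describes (split a large input, run the same activity on each fragment, merge the results, and keep a provenance record per fragment), here is a minimal Python sketch. It is not Hydra's API: the function names (`split_records`, `gc_content`, `run_data_parallel`), the GC-content activity, and the provenance fields are all hypothetical, and a local thread pool stands in for the HPC nodes a real deployment would use.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def split_records(records, n_fragments):
    """Partition (header, sequence) records into at most n_fragments chunks."""
    n = max(1, min(n_fragments, len(records)))
    return [records[i::n] for i in range(n)]


def gc_content(fragment_id, fragment):
    """Hypothetical activity: GC fraction per sequence (sequences assumed non-empty).

    Returns the results together with a small provenance record for this fragment.
    """
    start = time.time()
    result = {h: (s.count("G") + s.count("C")) / len(s) for h, s in fragment}
    provenance = {
        "fragment": fragment_id,
        "n_records": len(fragment),
        "elapsed_s": time.time() - start,
    }
    return result, provenance


def run_data_parallel(records, n_fragments=2):
    """Run the same activity on every fragment, then merge results and provenance."""
    fragments = split_records(records, n_fragments)
    # Threads are used here only for illustration; an HPC setting would
    # dispatch fragments to separate processes or cluster nodes instead.
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        outputs = list(pool.map(gc_content, range(len(fragments)), fragments))
    merged = {}
    for result, _ in outputs:
        merged.update(result)
    provenance_log = [p for _, p in outputs]
    return merged, provenance_log


records = [("seq1", "GGCC"), ("seq2", "ATAT"), ("seq3", "GATC")]
merged, provenance_log = run_data_parallel(records, n_fragments=2)
```

The merge step works here because each record's result is independent of the others; activities whose output depends on the whole dataset would need a different aggregation step.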



Data Parallelism | Distributed And Parallel Computing | HPDC 2010 | Scientific Workflows | Workflow |


» GPFlow: An Intuitive Environment for Web Based Scientific Workflow

» BioWMS: a web-based Workflow Management System for bioinformatics

» BioWorkflows with BizTalk: Using a Commercial Workflow Engine for eScience

» A semantic grid-based data access and integration service for bioinformatics

» Biowep: a workflow enactment portal for bioinformatics applications

» Improving the performance of speculatively parallel applications on the Hydra CMP

» A lightweight flow-based toolkit for parallel and distributed bioinformatics pipelines

» Using the reconfigurable massively parallel architecture COPACOBANA 5000 for applications ...

Post Info

Added	09 Nov 2010
Updated	09 Nov 2010
Type	Conference
Year	2010
Where	HPDC
Authors	Fábio Coutinho, Eduardo S. Ogasawara, Daniel de Oliveira, Vanessa P. Braganholo, Alexandre A. B. Lima, Alberto M. R. Dávila, Marta Mattoso
