With the exponential growth in the number of complete genome sequences, the analysis of these sequences is becoming a powerful approach to building genome-scale metabolic models. These models can be used to study molecular components, their activities, and their relationships, and thus to study cells as systems. However, constructing genome-scale metabolic models manually is time-consuming and labor-intensive. As a result, far fewer genome-scale metabolic models are available than the hundreds of genome sequences available. To tackle this problem, we design SWARM, a scientific workflow that can be used to improve genome-scale metabolic models in a high-throughput fashion. SWARM deals with a range of issues, including the integration of data across distributed resources, data format conversions, data updates, and data provenance. Taken together, SWARM streamlines the whole modeling process, which includes extracting data from vario...