Data streaming applications, usually composed of sequential and parallel tasks arranged as a data pipeline, bring new challenges to task scheduling and resource allocation in grid environments. Because data volumes are high and storage capacity is relatively limited, resource allocation and data streaming have to be storage aware. In this paper, a Genetic Algorithm (GA) is adopted for task scheduling of pipelines, based on online measurement and prediction with the Gray Model (GM). On-demand data streaming is introduced to avoid data overflow using repertory strategies. Experimental results show that balancing task executions with on-demand data streaming is required to improve overall performance, avoid system bottlenecks and backlogs of intermediate data, and increase the data throughput of pipelines as a whole.
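To make the prediction step concrete, the following is a minimal sketch of a first-order, single-variable gray model, GM(1,1), which is the most common form of GM. It is an illustration under stated assumptions, not the implementation used in this paper: the function name `gm11_forecast`, the four-sample minimum, and the sample measurements are all hypothetical. The sketch fits the model to a short series of positive online measurements (for example, per-interval processing rates) and extrapolates the next values.

```python
import math


def gm11_forecast(series, steps=1):
    """Forecast `steps` future values of a positive series with GM(1,1).

    Hypothetical helper for illustration only; not the paper's code.
    """
    n = len(series)
    if n < 4:
        # Assumption: require at least four samples, a common rule of thumb.
        raise ValueError("GM(1,1) needs at least 4 samples")

    # Accumulated generating operation (AGO): running sum of the raw series.
    x1 = []
    total = 0.0
    for value in series:
        total += value
        x1.append(total)

    # Background values: means of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(series[1:])

    # Least-squares fit of y = -a * z + b (ordinary linear regression).
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    a = -slope  # development coefficient

    if abs(a) < 1e-12:
        # Degenerate case: a flat series, so the forecast is the constant b.
        return [b] * steps

    def x1_hat(k):
        # Time response of the accumulated series at index k.
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    # Inverse AGO: difference consecutive accumulated predictions.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


if __name__ == "__main__":
    # Hypothetical measurements, e.g. MB/s observed over five intervals.
    measured = [10.0, 12.0, 14.4, 17.28, 20.736]
    print(gm11_forecast(measured, steps=2))
```

A scheduler could feed such forecasts into the fitness evaluation of candidate schedules, so that allocation decisions reflect expected rather than only past processing and transfer rates. How the forecasts are actually combined with the GA is described in the body of the paper, not in this sketch.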