The thesis of this research is that the task of exposing the parallelism in a given application should be left to the algorithm designer, who has intimate knowledge of the application characteristics. On the other hand, the task of limiting the parallelism in a chosen parallel algorithm is best handled by the compiler or operating system for the target MPP machine. Toward this end, we have developed CASS (Clustering And Scheduling System), a task management system that provides facilities for automatic granularity optimization and task scheduling of parallel programs on distributed-memory parallel architectures. CASS employs a two-phase method of compile-time scheduling, in which task clustering is performed prior to the actual scheduling process. The clustering module identifies the optimal number of processing nodes that the program requires to obtain maximum performance on the target parallel machine. The scheduling module maps the clusters onto a...
Jing-Chiou Liou, Michael A. Palis
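To make the two-phase structure concrete, the following is a minimal sketch of clustering followed by cluster-to-processor mapping. The task graph, cost values, merging rule, and processor count below are illustrative assumptions for exposition only; they are not the algorithms or data used by CASS.

```python
from collections import defaultdict

# Hypothetical task graph: node -> compute cost, edge (u, v) -> communication cost.
compute = {"a": 2, "b": 3, "c": 1, "d": 2}
comm = {("a", "c"): 4, ("b", "c"): 1, ("b", "d"): 3}

# Phase 1: clustering. Greedily merge each task with the predecessor it
# communicates with most heavily, so that edge's cost is zeroed out
# (both tasks end up on the same processing node).
cluster_of = {t: t for t in compute}          # each task starts in its own cluster

def find(t):
    # Follow merge links to the cluster's representative task.
    while cluster_of[t] != t:
        t = cluster_of[t]
    return t

for task in compute:
    preds = [(u, c) for (u, v), c in comm.items() if v == task]
    if preds:
        heaviest_pred, _ = max(preds, key=lambda p: p[1])
        cluster_of[find(task)] = find(heaviest_pred)   # merge the two clusters

clusters = defaultdict(list)
for task in compute:
    clusters[find(task)].append(task)

# Phase 2: scheduling. Map clusters onto a fixed number of processors,
# placing the next-largest cluster on the currently least-loaded processor.
num_procs = 2
load = [0] * num_procs
placement = {}
for cid, tasks in sorted(clusters.items(),
                         key=lambda kv: -sum(compute[t] for t in kv[1])):
    p = load.index(min(load))
    placement[cid] = p
    load[p] += sum(compute[t] for t in tasks)

print("clusters:", dict(clusters))
print("placement:", placement, "loads:", load)
```

In this sketch the clustering phase fixes the grain size by deciding which tasks must share a node, and only then does the scheduling phase balance those clusters across the available processors, mirroring the separation of concerns described in the abstract.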