In distributed-memory message-passing architectures, reducing communication cost is extremely important. In this paper, we present a technique to optimize communication globally. Our approach combines a linear algebra framework with dataflow analysis and can take arbitrary control flow into account. The distinctive features of the algorithm are its accuracy in keeping communication set information and its support for general alignments and distributions, including block-cyclic distributions. The method is currently being implemented in the PARADIGM compiler. Preliminary results show that the technique is effective in reducing both the number of messages and the communication volume.
Mahmut T. Kandemir, Prithviraj Banerjee, Alok N. C