Optimization Algorithms for Exploiting the Parallelism-Communication Tradeoff in Pipelined Parallelism