Abstract. Implementations of MPICH using the Cray SHMEM interface existed for the Cray T3 series of machines, but they were abandoned after the T3 series was discontinued. Support for the Cray SHMEM programming interface has nevertheless continued on other platforms, including commodity clusters built using the Quadrics QsNet network. In this paper, we describe a design for MPI on SHMEM that overcomes some of the limitations of the previous implementations. We compare the performance of the SHMEM MPI implementation with that of the native MPI implementation for Quadrics QsNet. Results show that our implementation is faster than the native implementation for certain message sizes on some micro-benchmarks.