The Message Passing Interface (MPI) is a standard in parallel computing and can also serve as a high-performance programming model for Grid application development. Executing MPI applications efficiently over a computational Grid is a major challenge for developers, because Grid resources are distributed and Grid links form complex hierarchies. In this paper, we present three techniques for improving the performance of MPI applications over a computational Grid. First, we introduce a multithreaded model into the implementation of MPI point-to-point operations, to overlap communication with computation and speed up point-to-point operations. Second, to enable the porting of MPI applications to a Grid composed of multiple private-IP clusters, we have designed a cross-subnet communication mechanism based on NAT. Third, to improve the performance of MPI collective operations over a computational Grid, we implement topology-aware collective communication algorithms based on a...