Abstract. High performance computing has become a key step in introducing computer tools, such as real-time registration, into the medical field. To achieve real-time processing, one usually simplifies and adapts algorithms so that they become application- and data-specific. This requires design and programming work for each application, and reduces the generality and robustness of the method. Our goal in this paper is to show that a general registration algorithm can be parallelized on an inexpensive, standard parallel architecture with a small amount of additional programming work, thus keeping the algorithm's performance intact. For medical applications, we show that a cheap cluster of dual-processor PCs connected by an Ethernet network is a good trade-off between the power and the cost of the parallel platform. Portability, scalability and safety requirements led us to choose OpenMP to program multi-processor machines and MPI to coordinate the different nodes of the cluster. The resulti...