Abstract. The performance of shared-memory (OpenMP) implementations of three different PDE solver kernels, representing finite difference methods, finite volume methods, and spectral methods, has been investigated. The experiments were performed on a self-optimizing NUMA system, the Sun Orange prototype, using different data placement and thread scheduling strategies. The results show that correct data placement is very important for the performance of all three solvers. However, the Orange system has a unique capability of automatically changing the data distribution at run time through both migration and replication of data. For reasonably large PDE problems, we find that the time required for this is negligible compared to the total solve time. Also, once the migration and replication process has reached steady state, the performance equals that achieved when data is optimally placed by hand tuning at the beginning of the execution. This shows that, for the application studied, the...