Efficient loop scheduling on parallel and distributed systems depends largely on load balancing, especially in heterogeneous PC-based cluster and grid computing environments. In this paper, we propose a general approach, named Performance-Based Parallel Loop Self-Scheduling (PPLSS), that partitions the workload according to the performance of grid nodes. We applied this approach to three types of application programs, which were executed on a testbed grid. Experimental results show that our approach executes efficiently for most scheduling parameters, provided that the estimate of node performance is accurate.
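
To illustrate the general idea of partitioning a loop according to node performance, the following is a minimal sketch and not the PPLSS algorithm itself, whose details are not given in this summary. It assumes each node has a single positive performance score and splits the iterations in proportion to those scores. The function name `partition_by_performance` and its parameters are hypothetical.

```python
def partition_by_performance(total_iterations, performance):
    """Split loop iterations among nodes in proportion to their scores.

    Illustrative only: `performance` is an assumed list of positive
    per-node scores (higher means faster). Returns one iteration count
    per node; the counts always sum to `total_iterations`.
    """
    if total_iterations < 0:
        raise ValueError("total_iterations must be non-negative")
    if not performance or any(p <= 0 for p in performance):
        raise ValueError("performance scores must be positive")

    total_perf = sum(performance)
    # Ideal (fractional) share for each node.
    exact = [total_iterations * p / total_perf for p in performance]
    # Round every share down first.
    shares = [int(x) for x in exact]
    # Hand the iterations lost to rounding to the nodes with the
    # largest fractional remainders (largest-remainder method).
    leftover = total_iterations - sum(shares)
    by_remainder = sorted(
        range(len(exact)),
        key=lambda i: exact[i] - shares[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        shares[i] += 1
    return shares


if __name__ == "__main__":
    # Three nodes, the third assumed twice as fast as the other two.
    print(partition_by_performance(1000, [1, 1, 2]))
```

A static split like this balances load only as well as the scores reflect actual node speed, which is why the accuracy of the performance estimate matters.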