Several message-passing-based parallel solvers have been developed for general (non-symmetric) sparse LU factorization with partial pivoting. Because this application requires fine-grained synchronization and a large communication volume between computing nodes, existing solvers are mostly intended for tightly coupled parallel computing platforms with high message-passing performance (e.g., 1–10 µs message latency and 100–1000 Mbytes/sec message throughput). To utilize platforms with slower message passing, this paper investigates techniques that can significantly reduce the application’s communication needs. In particular, we propose batch pivoting, which selects pivots in groups through speculative factorization and thus makes inter-processor synchronization substantially coarser-grained. We experimented with an MPI-based implementation on several message-passing platforms. While the speculative batch pivoting provides no performance benefit and even ...