Parallel Performance Wizard: A performance analysis tool for partitioned global-address-space programming