—Partitioned global address space (PGAS) languages, such as Unified Parallel C (UPC), promise high productivity. The shared address space view they provide makes distributing data and operating on ghost zones relatively easy, while the thread-data affinity they expose enables locality exploitation. In this paper, we consider sparse matrix multiplication, an important operation in many scientific and engineering applications for which several high-performance algorithms and libraries have recently been developed. In this work, we take advantage of one of the advanced features of UPC: its globally addressable memory model. This feature allows a thread to read or write data in any other thread's memory directly, without the explicit inter-process communication required by MPI. Our goal is to evaluate the performance o...