In designing application-specific bit-level architectures and in programming existing bit-level processor arrays, it is necessary to expand a word-level algorithm into its bit-...
Current operating systems offer basic support for network interface controllers (NICs) supporting remote direct memory access (RDMA). Such support typically consists of a device d...
This paper evaluates the use of per-node multi-threading to hide remote memory and synchronization latencies in a software DSM. As with hardware systems, multi-threading in softwa...
Partitioned global address space (PGAS) languages, such as Unified Parallel C (UPC), have the promise of being productive. Due to the shared address space view that they provide,...
Researchers have recently proposed coupling small- to medium-scale multiprocessors to build large-scale shared memory machines, known as multigrain shared memory systems. Multigrai...