To achieve high performance, contemporary microprocessors must effectively process the four major instruction types: ALU, branch, load, and store. This paper...
Bryan Black, Brian Mueller, Stephanie Postal, Ryan...
System developments and research on parallel query processing have so far concentrated on either "Shared Everything" or "Shared Nothing" architectures. While t...
Many important parallel applications require multiple flows of control to run on a single processor. In this paper, we present a study of four flow-of-control mechanisms: proces...
In this paper we present four parallel algorithms to compute any group of eigenvalues and eigenvectors of a Toeplitz-plus-Hankel matrix. These algorithms parallelize a method that...
Regular distributions for storing dense matrices on parallel systems are not always used in practice. In many scientific applicati... RUMMA) [1] to handle irregularly distributed mat...