Kernel summations are a ubiquitous key computational bottleneck in many data analysis methods. In this paper, we attempt to marry, for the first time, the best relevant technique...
Dongryeol Lee, Richard W. Vuduc, Alexander G. Gray
Abstract. This paper is motivated by the desire to provide an efficient and scalable software cache implementation of OpenMP on multicore and manycore architectures in general, and...
Chen Chen, Joseph B. Manzano, Ge Gan, Guang R. Gao...
We introduce a shared memory software prototype system for executing programs with nested parallelism on a network of workstations. This programming model exhibits a very convenie...
To provide high dependability in a multithreaded system despite hardware faults, the system must detect and correct errors in its shared memory system. Recent research has explore...
The Data Diffusion Machine is a scalable virtual shared memory architecture. A hierarchical network is used to ensure that all data can be located in a time bounded by O(logp), wh...
Henk L. Muller, Paul W. A. Stallard, David H. D. W...