—Image population analysis is the class of statistical methods that plays a central role in understanding the development, evolution and disease of a population. However, these t...
Abstract. This paper presents an innovative architecture for a reconfigurable device that allows single cycle context switching and single cycle random access to the unified on-chi...
Reetinder P. S. Sidhu, Sameer Wadhwa, Alessandro M...
Per-core local (scratchpad) memories allow direct inter-core communication, with latency and energy advantages over coherent cache-based communication, especially as CMP architect...
Stamatis G. Kavadias, Manolis Katevenis, Michail Z...
—This paper proposes a parallelization scheme for parameter sweep (PS) applications using the compute unified device architecture (CUDA). Our scheme focuses on PS applications w...
Latency tolerance is essential in achieving high performance on parallel computers for remote function calls and fine-grained remote memory accesses. EM-X supports interprocessor ...