There is building interest in using FPGAs as accelerators for high-performance computing, but existing systems for programming them are so far inadequate. In this paper we propose...
—Static cache analysis for data allocated on the heap is practically impossible for standard data caches. We propose a distinct object cache for heap allocated data. The cache is...
A metascalable (or “design once, scale on new architectures”) parallel computing framework has been developed for large spatiotemporal-scale atomistic simulations of materials...
Ken-ichi Nomura, Richard Seymour, Weiqiang Wang, H...
Clusters are now composed of non-uniform nodes with different CPUs, disks or network cards so that customers can adapt the cluster configuration to the changing technologies and t...
Tobias Mayr, Philippe Bonnet, Johannes Gehrke, Pra...
In retargeting loop-based code for multimedia instruction set extensions, a critical issue is that vector data types of mixed precision within a loop body complicate the paralleli...