We present parallel and sequential dense QR factorization algorithms that are optimized to avoid communication. Some of these are novel, and some extend earlier work. Communicatio...
James Demmel, Laura Grigori, Mark Hoemmen, Julien ...
Data locality is critical to achievinghigh performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...
Abstract. This paper presents an efficient parallel algorithm for volume rendering of large-scale datasets. Our algorithm focuses on an optimization technique, namely early ray te...
: The STI CELL processor introduces pioneering solutions in processor architecture. At the same time it presents new challenges for the development of numerical algorithms. One is ...
A rising horizon in chip fabrication is the 3D integration technology. It stacks two or more dies vertically with a dense, high-speed interface to increase the device density and ...
Xiuyi Zhou, Yi Xu, Yu Du, Youtao Zhang, Jun Yang 0...