Existing SIMD extensions cannot efficiently vectorize the histogram function due to memory collisions. We propose two techniques to avoid this problem. In the first, a hierarchi...
Asadollah Shahbahrami, Ben H. H. Juurlink, Stamati...
In Thread-Level Speculation (TLS), speculative tasks generate memory state that cannot simply be combined with the rest of the system because it is unsafe. One way to deal with th...
The graphics processing unit (GPU) has evolved from a fixedfunction processor with programmable stages to a programmable processor with many fixed-function components that deliver...
Quantum chemistry applications such as the General Atomic and Molecular Electronic Structure System (GAMESS) that can execute on a complex peta-scale parallel computing environment...
Abstract--Developing and tuning computational science applications to run on extreme scale systems are increasingly complicated processes. Challenges such as managing memory access...
Philip H. Carns, Robert Latham, Robert B. Ross, Ka...