Sciweavers

ASAP
2007
IEEE
14 years 1 month ago
SIMD Vectorization of Histogram Functions
Existing SIMD extensions cannot efficiently vectorize the histogram function due to memory collisions. We propose two techniques to avoid this problem. In the first, a hierarchi...
Asadollah Shahbahrami, Ben H. H. Juurlink, Stamati...
IEEEPACT
2003
IEEE
14 years 22 days ago
Using Software Logging to Support Multi-Version Buffering in Thread-Level Speculation
In Thread-Level Speculation (TLS), speculative tasks generate memory state that cannot simply be combined with the rest of the system because it is unsafe. One way to deal with th...
María Jesús Garzarán, Milos P...
IPPS
2010
IEEE
13 years 5 months ago
Inter-block GPU communication via fast barrier synchronization
The graphics processing unit (GPU) has evolved from a fixed-function processor with programmable stages to a programmable processor with many fixed-function components that deliver...
Shucai Xiao, Wu-chun Feng
IPPS
2010
IEEE
13 years 5 months ago
Enhancing adaptive middleware for quantum chemistry applications with a database framework
Quantum chemistry applications such as the General Atomic and Molecular Electronic Structure System (GAMESS) that can execute on a complex peta-scale parallel computing environment...
Lakshminarasimhan Seshagiri, Meng-Shiou Wu, Masha ...
CLUSTER
2009
IEEE
13 years 11 months ago
24/7 Characterization of petascale I/O workloads
Developing and tuning computational science applications to run on extreme scale systems are increasingly complicated processes. Challenges such as managing memory access...
Philip H. Carns, Robert Latham, Robert B. Ross, Ka...