Sciweavers

309 search results - page 49 / 62
» Parallel Memory Architecture for Arbitrary Stride Accesses
IEEEPACT
2008
IEEE
14 years 1 month ago
Leveraging on-chip networks for data cache migration in chip multiprocessors
Recently, chip multiprocessors (CMPs) have arisen as the de facto design for modern high-performance processors, with increasing core counts. An important property of CMPs is that...
Noel Eisley, Li-Shiuan Peh, Li Shang
IPPS
1998
IEEE
13 years 11 months ago
Impact of Switch Design on the Application Performance of Cache-Coherent Multiprocessors
In this paper, the effect of switch design on the application performance of cache-coherent non-uniform memory access (CC-NUMA) multiprocessors is studied in detail. Wormhole rout...
Laxmi N. Bhuyan, Hu-Jun Wang, Ravi R. Iyer, Akhile...
ICS
2009
Tsinghua U.
14 years 2 months ago
Fast and scalable list ranking on the GPU
General purpose programming on the graphics processing units (GPGPU) has received a lot of attention in the parallel computing community as it promises to offer the highest perfo...
M. Suhail Rehman, Kishore Kothapalli, P. J. Naraya...
ICS
2010
Tsinghua U.
14 years 9 days ago
Small-ruleset regular expression matching on GPGPUs: quantitative performance analysis and optimization
We explore the intersection between an emerging class of architectures and a prominent workload: GPGPUs (General-Purpose Graphics Processing Units) and regular expression matching...
Jamin Naghmouchi, Daniele Paolo Scarpazza, Mladen ...
IPPS
2007
IEEE
14 years 1 month ago
Load Miss Prediction - Exploiting Power Performance Trade-offs
Modern CPUs operate at GHz frequencies, but the latencies of memory accesses are still relatively large, in the order of hundreds of cycles. Deeper cache hierarchies with large...
Konrad Malkowski, Greg M. Link, Padma Raghavan, Ma...