Sciweavers

» Performance Modeling and Measurement of Parallelized Code fo...
EMSOFT
2005
Springer
14 years 2 months ago
Optimizing inter-processor data locality on embedded chip multiprocessors
Recent research in embedded computing indicates that packing multiple processor cores on the same die is an effective way of utilizing the ever-increasing number of transistors. T...
Guilin Chen, Mahmut T. Kandemir
TOCS
1998
13 years 8 months ago
Performance Evaluation of the Orca Shared-Object System
Orca is a portable, object-based distributed shared memory system. This paper studies and evaluates the design choices made in the Orca system and compares Orca with other DSMs. T...
Henri E. Bal, Raoul Bhoedjang, Rutger F. H. Hofman...
HPCA
1996
IEEE
14 years 1 month ago
A Comparison of Entry Consistency and Lazy Release Consistency Implementations
This paper compares several implementations of entry consistency (EC) and lazy release consistency (LRC), two relaxed memory models in use with software distributed shared memory ...
Sarita V. Adve, Alan L. Cox, Sandhya Dwarkadas, Ra...
CF
2010
ACM
14 years 4 days ago
Hybrid Parallel Programming with MPI and Unified Parallel C
The Message Passing Interface (MPI) is one of the most widely used programming models for parallel computing. However, the amount of memory available to an MPI process is limited ...
James Dinan, Pavan Balaji, Ewing L. Lusk, P. Saday...
CCGRID
2003
IEEE
14 years 14 days ago
Kernel Level Speculative DSM
Interprocess communication (IPC) is ubiquitous in today's computing world. One of the simplest mechanisms for IPC is shared memory. We present a system that enhances the Syst...
Cristian Tapus