With wide adoption of chip multiprocessors (CMPs) in modern computers, there is an increasing demand for large capacity main memory systems. The emerging PCM (Phase Change Memory) ...
We consider the problem of broadcasting a large message in a large scale distributed platform. The message must be sent from a source node, with the help of the receiving peers whi...
This paper is the first extensive performance study of a recently proposed parallel programming model, called Concurrent Collections (CnC). In CnC, the programmer expresses her co...
Automatic tuning has emerged as a solution to provide high-performance libraries for fast changing, increasingly complex computer architectures. We distinguish offline adaptation (...
Line-rate data traffic monitoring in high-speed networks is essential for network management. To satisfy the line-rate requirement, one can leverage multi-core architectures to par...
Patrick P. C. Lee, Tian Bu, Girish P. Chandranmeno...
The computational power provided by many-core graphics processing units (GPUs) has been exploited in many applications. The programming techniques currently employed on these GPUs...
Long Chen, Oreste Villa, Sriram Krishnamoorthy, Gu...
Abstract: Existing multicore systems already provide deep levels of thread parallelism. Hybrid programming models and composability of parallel libraries are very active areas of r...
Costin Iancu, Steven Hofmeyr, Filip Blagojevic, Yi...
Abstract--The lag of parallel programming models and languages behind the advance of heterogeneous many-core processors has left a gap between the computational capability of moder...
Abstract--The growing complexity in computer system hierarchies due to the increase in the number of cores per processor, levels of cache (some of them shared) and the number of pr...