Sciweavers

1156 search results - page 128 / 232
» Efficient Barriers for Distributed Shared Memory Computers
HPCA
2009
IEEE
14 years 10 months ago
Design and implementation of software-managed caches for multicores with local memory
Heterogeneous multicores, such as Cell BE processors and GPGPUs, typically do not have caches for their accelerator cores because coherence traffic, cache misses, and latencies fr...
Sangmin Seo, Jaejin Lee, Zehra Sura
PPOPP
2006
ACM
14 years 4 months ago
McRT-STM: a high performance software transactional memory system for a multi-core runtime
Applications need to become more concurrent to take advantage of the increased computational power provided by chip level multiprocessing. Programmers have traditionally managed t...
Bratin Saha, Ali-Reza Adl-Tabatabai, Richard L. Hu...
IPPS
1996
IEEE
14 years 2 months ago
Kiloprocessor Extensions to SCI
To expand the Scalable Coherent Interface's (SCI) capabilities so it can be used to efficiently handle sharing in systems of hundreds or even thousands of processors, the SCI...
Stefanos Kaxiras
ICDE
2011
IEEE
13 years 1 months ago
Memory-constrained aggregate computation over data streams
In this paper, we study the problem of efficiently computing multiple aggregation queries over a data stream. In order to share computation, prior proposals have suggested ins...
K. V. M. Naidu, Rajeev Rastogi, Scott Satkin, Anan...
EUROPAR
2005
Springer
14 years 3 months ago
A Paradigm for Parallel Matrix Algorithms:
A style for programming problems from matrix algebra is developed with a familiar example and new tools, yielding high performance with a couple of surprising exceptions. The under...
David S. Wise, Craig Citro, Joshua Hursey, Fang Li...