Sciweavers

LCPC
2007
Springer
14 years 26 days ago
A Novel Asynchronous Software Cache Implementation for the Cell-BE Processor
This paper describes the implementation of a runtime library for asynchronous communication in the Cell BE processor. The runtime library implementation provides with several servi...
Jairo Balart, Marc González, Xavier Martore...
LPNMR
2009
Springer
14 years 1 month ago
Application of ASP for Automatic Synthesis of Flexible Multiprocessor Systems from Parallel Programs
Configurable on chip multiprocessor systems combine advantages of task-level parallelism and the flexibility of field-programmable devices to customize architectures for paralle...
Harold Ishebabi, Philipp Mahr, Christophe Bobda, M...
CORR
2006
Springer
13 years 6 months ago
Decentralized Erasure Codes for Distributed Networked Storage
We consider the problem of constructing an erasure code for storage over a network when the data sources are distributed. Specifically, we assume that there are n storage nodes wit...
Alexandros G. Dimakis, Vinod M. Prabhakaran, Kanna...
ISCA
2011
IEEE
12 years 10 months ago
Rebound: scalable checkpointing for coherent shared memory
As we move to large manycores, the hardware-based global checkpointing schemes that have been proposed for small shared-memory machines do not scale. Scalability barriers include ...
Rishi Agarwal, Pranav Garg, Josep Torrellas
ICS
2010
Tsinghua U.
13 years 11 months ago
Large-scale FFT on GPU clusters
A GPU cluster is a cluster equipped with GPU devices. Excellent acceleration is achievable for computation-intensive tasks (e.g. matrix multiplication and LINPACK) and bandwidth-i...
Yifeng Chen, Xiang Cui, Hong Mei