Sciweavers

Search results » Evaluating CMPs and Their Memory Architecture
ICCD
2004
IEEE
Design and Implementation of Scalable Low-Power Montgomery Multiplier
In this paper, an efficient Montgomery multiplier is introduced for the modular exponentiation operation, which is fundamental to numerous public-key cryptosystems. Four aspects a...
Hee-Kwan Son, Sang-Geun Oh
DATE
2009
IEEE
Light NUCA: A proposal for bridging the inter-cache latency gap
To deal with the “memory wall” problem, microprocessors include large secondary on-chip caches. But as these caches enlarge, they originate a new latency gap between...
Darío Suárez Gracia, Teresa Monreal,...
IEEE PACT
2008
IEEE
Mars: a MapReduce framework on graphics processors
We design and implement Mars, a MapReduce framework, on graphics processors (GPUs). MapReduce is a distributed programming framework originally proposed by Google for the ease of ...
Bingsheng He, Wenbin Fang, Qiong Luo, Naga K. Govi...
DAMON
2007
Springer
Vectorized data processing on the cell broadband engine
In this work, we research the suitability of the Cell Broadband Engine for database processing. We start by outlining the main architectural features of Cell and use microbenchmar...
Sándor Héman, Niels Nes, Marcin Zuko...
HIPC
2007
Springer
Optimization of Collective Communication in Intra-cell MPI
The Cell is a heterogeneous multi-core processor, which has eight co-processors, called SPEs. The SPEs can access a common shared main memory through DMA, and each SPE can direct...
M. K. Velamati, Arun Kumar, Naresh Jayam, Ganapath...