Search Sciweavers | Sciweavers

1998 search results - page 81 / 400

» A Hardware Implementation of PRAM and Its Performance Evalua...


CLUSTER
2011
IEEE

193 views, Distributed And Parallel Com...

Performance Characterization and Optimization of Atomic Operations on AMD GPUs

14 years 5 months ago

Download synergy.cs.vt.edu

Atomic operations are important building blocks in supporting general-purpose computing on graphics processing units (GPUs). For instance, they can be used to coordinate executi...

Marwa Elteir, Heshan Lin, Wu-chun Feng



EUROPAR
2000
Springer

102 views, Distributed And Parallel Com...

Use of Performance Technology for the Management of Distributed Systems

15 years 9 months ago

Download www.c3.lanl.gov

This paper describes a toolset, PACE, that provides detailed predictive performance information throughout the implementation and execution stages of an application. It is structur...

Darren J. Kerbyson, John S. Harper, Efstathios Pap...



PDP
2009
IEEE

117 views, Distributed And Parallel Com...

High Throughput Intra-Node MPI Communication with Open-MX

16 years 18 days ago

Download hal.inria.fr

The increasing number of cores per node in high-performance computing requires an efficient intra-node MPI communication subsystem. Most existing MPI implementations rel...

Brice Goglin



ISLPED
1999
ACM

150 views, Hardware

Using dynamic cache management techniques to reduce energy in a high-performance processor

15 years 10 months ago

Download www.cs.york.ac.uk

In this paper, we propose a technique that uses an additional mini cache, the L0-Cache, located between the instruction cache (I-Cache) and the CPU core. This mechanism can provid...

Nikolaos Bellas, Ibrahim N. Hajj, Constantine D. P...



IPPS
2002
IEEE

123 views, Distributed And Parallel Com...

Efficient Pipelining of Nested Loops: Unroll-and-Squash

15 years 10 months ago

Download www.coe.uncc.edu

The size and complexity of current custom VLSI have forced the use of high-level programming languages to describe hardware, and compiler and synthesis technology to map abstract designs ...

Darin Petkov, Randolph E. Harr, Saman P. Amarasing...


