Sciweavers

» On Using Incremental Profiling for the Performance Analysis ...
PAAPP
2006
13 years 8 months ago
Algorithmic optimizations of a conjugate gradient solver on shared memory architectures
OpenMP is an architecture-independent language for programming in the shared memory model. OpenMP is designed to be simple and powerful in terms of programming abstractions. Unfortunately,...
Henrik Löf, Jarmo Rantakokko
IPPS
2000
IEEE
14 years 1 month ago
Controlling Distributed Shared Memory Consistency from High Level Programming Languages
One of the keys to the success of parallel processing is the availability of high-level programming languages for off-the-shelf parallel architectures. Using explicit message passi...
Yvon Jégou
ICCD
2006
IEEE
14 years 5 months ago
Microarchitecture and Performance Analysis of Godson-2 SMT Processor
This paper introduces the microarchitecture and logical implementation of the SMT (Simultaneous Multithreading) improvement of the Godson-2 processor, which is a 64-bit, four-issue, out-...
Zusong Li, Xianchao Xu, Weiwu Hu, Zhimin Tang
IPPS
2005
IEEE
14 years 2 months ago
Power and Energy Profiling of Scientific Applications on Distributed Systems
Power consumption is a troublesome design constraint for emergent systems such as IBM’s BlueGene/L. If current trends continue, future petaflop systems will require 100 megawat...
Xizhou Feng, Rong Ge, Kirk W. Cameron
COMPGEOM
2006
ACM
14 years 2 months ago
Engineering a compact parallel Delaunay algorithm in 3D
We describe an implementation of a compact parallel algorithm for 3D Delaunay tetrahedralization on a 64-processor shared-memory machine. Our algorithm uses a concurrent version o...
Daniel K. Blandford, Guy E. Blelloch, Clemens Kado...