Using a cluster of commodity Alpha processors we compare two software platforms based on Linux and Windows NT and intended to support intensive scientic computations. Networking a...
The architecture for a shared memory CPU is described. The CPU allows for parallelism down to the level of single instructions and is tolerant of memory latency. All executable in...
In this paper, we consider alternate ways of storing a sparse matrix and their effect on computational speed. They involve keeping both the indices and the non-zero elements in t...
This paper focuses on non-strict processing, optimization, and partial evaluation of MPI programs which use incremental data structures (ISs). We describe the design and implement...
This paper proposes a source-level timing annotation method for generation of accurate transaction level models for software computation modules. While Transaction Level Modeling ...