This paper describes an approach to the implementation and the operation of a Simultaneous Multithreaded processor. We propose an architecture which integrates a software mechanism...
Processor cycle time continues to decrease faster than main memory access times, placing higher demands on cache memory hierarchy performance. To meet these demands, conventional ...
Alvin R. Lebeck, David R. Raymond, Chia-Lin Yang, ...
This paper describes a tiling technique that can be used by application programmers and optimizing compilers to obtain I/O-efficient versions of regular scientific loop nests. Du...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
Abstract. The High Performance Fortran (HPF) benchmark suite HPFBench was designed for evaluating the HPF language and compilers on scalable architectures. The functionality of the...
Two-level predictors deliver highly accurate conditional branch prediction, indirect branch target prediction and value prediction. Accurate prediction enables speculative executio...
Growing demand for high performance in embedded systems is creating new opportunities for Instruction-Level Parallelism ILP techniques that are traditionally used in high perform...
Daniel A. Connors, Jean-Michel Puiatti, David I. A...
Abstract. This paper presents DAOS, a model for exploitation of Andand Or-parallelism in logic programs. DAOS assumes a physically distributed memory environment and a logically sh...
This paper reports the parallel implementation of adaptive mesh re nement within nite di erence ocean circulation models. The implementation is based on the model of MalleableTasks...
This paper deals with the influence of hardware aspects of modern computer architectures to the design of software for numerical simulations. We present performance tests for vari...