For complex queries in parallel database systems, substantial amounts of data must be redistributed between operators executed on different processing nodes. Frequently, such inter...
This paper describes an approach to the implementation and the operation of a Simultaneous Multithreaded processor. We propose an architecture which integrates a software mechanism...
Processor cycle time continues to decrease faster than main memory access times, placing higher demands on cache memory hierarchy performance. To meet these demands, conventional ...
Alvin R. Lebeck, David R. Raymond, Chia-Lin Yang, ...
This paper describes a tiling technique that can be used by application programmers and optimizing compilers to obtain I/O-efficient versions of regular scientific loop nests. Du...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
Abstract. The High Performance Fortran (HPF) benchmark suite HPFBench was designed for evaluating the HPF language and compilers on scalable architectures. The functionality of the...
Two-level predictors deliver highly accurate conditional branch prediction, indirect branch target prediction and value prediction. Accurate prediction enables speculative executio...
Growing demand for high performance in embedded systems is creating new opportunities for Instruction-Level Parallelism ILP techniques that are traditionally used in high perform...
Daniel A. Connors, Jean-Michel Puiatti, David I. A...
Abstract. This paper presents DAOS, a model for exploitation of Andand Or-parallelism in logic programs. DAOS assumes a physically distributed memory environment and a logically sh...