In this work we present the runtime architecture of the OMPi OpenMP compiler. OMPi is a source-to-source C translator featuring a portable, modular and extensible runtime system. ...
Giorgos Ch. Philos, Vassilios V. Dimakopoulos, Pan...
Identifying and inferring the performance of a network topology is a well-known problem. Achieving this by using only end-to-end measurements at the application level is a method kno...
We consider the problem of computing all Nash equilibria in bimatrix games (i.e., nonzero-sum two-player noncooperative games). Computing all Nash equilibria for large bimatrix ga...
This paper discusses the design and the implementation of the LU factorization routines included in the Heterogeneous ScaLAPACK library, which is built on top of ScaLAPACK. These ...
Ravi Reddy Manumachu, Alexey L. Lastovetsky, Pedro...
This paper presents a new approach for analyzing the performance of grid scheduling algorithms for tasks with dependencies. Finding the optimal procedures for DAG scheduling in Gr...
Multi-comparand associative processors are efficient in parallel processing of complex search problems that arise from many application areas, including computational geometry, ...
ABSTRACT OF A DISSERTATION submitted in partial fulfillment of the requirements for the degree DOCTOR OF PHILOSOPHY Department of Computing and Information Sciences College of Engineer...
Parallel computing is notoriously challenging due to the difficulty in developing correct and efficient programs. With the arrival of multi-core processors for desktop systems, ...
Data prefetching has been considered an effective way to mask data access latency caused by cache misses and to bridge the performance gap between processor and memory. With hardw...