An analysis is presented of the primary factors influencing the performance of a parallel implementation of the UCLA atmospheric general circulation model (AGCM) on distributedme...
In the study of PetaFlop project, Processor-In-Memory array was proposed to be a target architecture in achieving 1015 floating point operations per second computing performance. ...
Yi Tian, Edwin Hsing-Mean Sha, Chantana Chantrapor...
A heterogeneous computing system provides a variety of different machines, orchestrated to perform an application whose subtasks have diverse execution requirements. The subtasks ...
Abstract. We discuss a new optimisation for recursive functions yielding multiple results in tuples for lazy functional languages, like Clean and Haskell. This optimisation improve...
In this paper we present co-transformation, a novel approach to the mapping of execution information from the source code of a program to the object code for the purpose of worst-...
Previous timing analysis methods have assumed that the worst-case instruction execution time necessarily corresponds to the worst-case behavior. We show that this assumption is wr...
In this paper we present a technique for Worst-Case Execution Time WCET analysis for pipelined processors. Our technique uses a standard simulator instead of special-purpose pipel...
To minimize the execution time of an iterative application in a heterogeneous parallel computing environment, an appropriate mapping scheme is needed for matching and scheduling th...
Yu-Kwong Kwok, Anthony A. Maciejewski, Howard Jay ...
The bigraph crossing problem, embedding the two node sets of a bipartite graph G = V0;V1;E along two parallel lines so that edge crossings are minimized, has application to placeme...
Matthias F. M. Stallmann, Franc Brglez, Debabrata ...
The system of equations that govern kinematically redundant manipulators is commonly solved by nding the singular value decomposition (SVD) of the corresponding Jacobian matrix. T...
Tracy D. Braun, Anthony A. Maciejewski, Howard Jay...