This paper focuses on the Cyclops64 computer architecture and presents an analytical model and performance simulation results for the preloading and loop unrolling approaches to op...
Yanwei Niu, Ziang Hu, Kenneth E. Barner, Guang R. ...
In this paper, we examine the trade-offs in performance and area due to customizing the datapath and instruction set architecture of a soft VLIW processor implemented in a high-den...
Mazen A. R. Saghir, Mohamad El-Majzoub, Patrick Ak...
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
This paper presents a new method, based on Markov chain analysis, to evaluate the performance of schedules of behavioral specifications. The proposed performance measure is the expe...
Minimizing the energy cost and improving thermal performance of power-limited datacenters, deploying large computing clusters, are the key issues towards optimizing the...