Many standardized hardware communication interfaces offer runtime flexibility and configurability at the cost of efficiency. An alternate approach is the use of a highly-effic...
Steve Ward, Karim Abdalla, Rajeev Dujari, Michael ...
Data dependence analysis techniques are the main component of today’s strategies for automatic detection of parallelism. Parallelism detection strategies are being incorporate...
Mapping data to parallel computers aims at minimizing the execution time of the associated application. However, it can take an unacceptable amount of time in comparison with the ...
Nashat Mansour, Ravi Ponnusamy, Alok N. Choudhary,...
Shared memory provides a uniform and attractive mechanism for communication. For efficiency, it is often implemented with a layer of interpretive hardware on top of a message-pas...
Performance monitoring of large-scale parallel computers creates a dilemma: we need to collect detailed information to find performance bottlenecks, yet collecting all this data ...
This paper examines the effectiveness of decoupling as an optimization technique for high-performance computer architectures. Decoupled access execute architectures are described,...
Peter L. Bird, Alasdair Rawsthorne, Nigel P. Topha...
The EM-4 is a supercomputer that offers very fast interprocessor communication and support for multithreading. In this paper we demonstrate that the EM-4, together with an auto...