The counterflow pipeline concept was originated by Sproull et al.[1] to demonstrate the concept of asynchronous circuits. This architecture relies on distributed decision making an...
Run-time parallelization is often the only way to execute the code in parallel when data dependence information is incomplete at compile time. This situation is common in many imp...
The performance of a concurrent multithreaded architectural model, called superthreading 15 , is studied in this paper. It tries to integrate optimizing compilation techniques and...
Jenn-Yuan Tsai, Zhenzhen Jiang, Eric Ness, Pen-Chu...
As we look to the future, and the prospect of a billion transistors on a chip, it seems inevitable that microprocessors will exploit having multiple parallel threads. To achieve t...
This paper examines the performance benefits of employing multicast communication and application-level multithreading in the Brazos software distributed shared memory (DSM) syste...
In this paper we introduce a page-based Lazy Release Consistency protocol called ADSM that constantly and efficiently adapts to the applications' sharing patterns. Adaptation...
The frequency of accesses to remote data is a key factor affecting the performance of all Distributed Shared Memory (DSM) systems. Remote data caching is one of the most effective...
Good network hardware performance is often squandered by overheads for accessing the network interface (NI) within a host. NIs that support user-level messaging avoid frequent ope...
We propose and evaluate two complementary techniques to protect and virtualize a tightly-coupled network interface in a multicomputer. The techniques allow efficient, direct appli...
Kenneth Mackenzie, John Kubiatowicz, Matthew Frank...
Parallel computing on clusters of workstations is attractive because of the low costs in comparison to MPPs, but the speed of the local area network limits the class of applicatio...
Koen Langendoen, Rutger F. H. Hofman, Henri E. Bal