Data locality is critical to achievinghigh performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...
Many parallel scientific applications need high-performance I/O. Unfortunately, end-to-end parallel-I/O performance has not been able to keep up with substantial improvements in p...
The routing architecture of the original 4.4BSD [3] kernel has been deployed successfully without major design modification for over 15 years. In the unified routing architectur...
A number of experiments regarding the placement of instructions, private data and shared data in the Non-Uniform-Memory-Access multiprocessor, RP3 has been performed. Three Scient...
We present an active switch architecture to improve the performance of systems connected via system area networks. Our programmable active switches not only flexibly route packets...