IP forwarding is one of the main bottlenecks in Internet backbone routers, as it requires performing the longest-prefix match at 10Gbps speed or higher. IPv6 forwarding further ex...
With the widening performance gap between processors and main memory, efficient memory accessing behavior is necessary for good program performance. Loop partition is an effective...
We show empirically that some of the issues that affected the design of linear algebra libraries for distributed memory architectures will also likely affect such libraries for s...
Bryan Marker, Field G. Van Zee, Kazushige Goto, Gr...
—Performance degradation of memory-intensive programs caused by the LRU policy’s inability to handle weaklocality data accesses in the last level cache is increasingly serious ...
This paper evaluates the suitability of the MapReduce model for multi-core and multi-processor systems. MapReduce was created by Google for application development on data-centers...