Recently, multi-core architectures with alternative memory subsystem designs have emerged. Instead of using hardwaremanaged cache hierarchies, they employ software-managed embedde...
We expect that many-core microprocessors will push performance per chip from the 10 gigaflop to the 10 teraflop range in the coming decade. To support this increased performance...
Dana Vantrease, Robert Schreiber, Matteo Monchiero...
Providing adequate data bandwidth is extremely important for a wide-issue superscalar processor to achieve its full performance potential. Adding a large number of ports to a data...
Just-in-time (JIT) compilation has previously been used in many applications to enable standard software binaries to execute on different underlying processor architectures. Howev...
Despite the large research efforts in the SW–DSM community, this technology has not yet been adapted widely for significant codes beyond benchmark suites. One of the reasons co...