Single-chip parallel processing requires high bandwidth between processors and on-chip memory modules. A recently proposed Mesh-of-Trees (MoT) network provides high throughput and...
This paper presents a hardware-based dynamic optimizer that continuously optimizes an application’s instruction stream. In continuous optimization, dataflow optimizations are p...
Brian Fahs, Todd M. Rafacz, Sanjay J. Patel, Steve...
A Java bytecode-to-C ahead-of-time compiler (AOTC) can improve the performance of a Java virtual machine (JVM) by translating bytecode into C code, which is then compiled into mac...
One of the essential features in modern computer systems is context switching, which allows multiple threads of execution to time-share a limited number of processors. While very ...
Fang Liu, Fei Guo, Yan Solihin, Seongbeom Kim, Abd...
As the desire of scientists to perform ever-larger computations drives the size of today’s high-performance computers from hundreds to thousands and even tens of thousands of ...