—Super-scalar, out-of-order processors that can have tens of read and write requests in the execution window place significant demands on Memory Level Parallelism (MLP). Multi- ...
George C. Caragea, Alexandros Tzannes, Fuat Keceli...
Abstract. This paper describes performance of OSCAR multigrain parallelizing compiler on various SMP servers, such as IBM pSeries 690, Sun Fire V880, Sun Ultra 80, NEC TX7/i6010 an...
Kazuhisa Ishizaka, Takamichi Miyamoto, Jun Shirako...
The explosive growth in the performance of microprocessors and networks has created a new opportunity to reduce the latency of fine-grain communication. Microprocessor clock speed...
Future CMPs will combine many simple cores with deep cache hierarchies. With more cores, cache resources per core are fewer, and must be shared carefully to avoid poor utilization...
Junli Gu, Steven S. Lumetta, Rakesh Kumar, Yihe Su...
While NoCs are efficient in delivering high throughput point-to-point traffic, their multi-hop operation is too slow for latency sensitive signals. In addition, NoCS are inefficie...
Ran Manevich, Isask'har Walter, Israel Cidon, Avin...