We present a parallel code generation algorithm for complete applications and a new experimental methodology that tests the efficacy of our approach. The algorithm optimizes for d...
Multicore processors have become ubiquitous in today’s systems, but exploiting the parallelism they offer remains difficult, especially for legacy application and applications ...
Recent I/O technologies such as PCI-Express and 10Gb Ethernet enable unprecedented levels of I/O bandwidths in mainstream platforms. However, in traditional architectures, memory ...
The observations in many applications consist of counts of discrete events, such as photons hitting a detector, which cannot be effectively modeled using an additive bounded or Ga...
Zachary T. Harmany, Roummel F. Marcia, Rebecca Wil...
Many parallel systems offer a simple view of memory: all storage cells are addresseduniformly. Despite a uniform view of the memory, the machines differsignificantly in theirmemo...