Instruction set customization accelerates the performance of applications by compressing the length of critical dependence paths and reducing the demands on processor resources. W...
Sami Yehia, Nathan Clark, Scott A. Mahlke, Kriszti...
Despite a burgeoning demand for parallel programs, the tools available to developers working on shared-memory multicore processors have lagged behind. One reason for this is the l...
Marek Olszewski, Qin Zhao, David Koh, Jason Ansel,...
Consider a multithreaded parallel application running inside a multicore virtual machine context that is itself hosted on a multi-socket multicore physical machine. How should the...
As transistor density continues to grow at an exponential rate in accordance with Moore’s law, the goal for many Chip Multi-Processor (CMP) systems is to scale the number of on-ch...
Brian M. Rogers, Anil Krishna, Gordon B. Bell, Ken...
Real-time Garbage Collection (RTGC) has recently advanced to the point where it is being used in production for financial trading, military command-and-control, and telecommunicat...
Joshua S. Auerbach, David F. Bacon, Perry Cheng, D...