There is still much performance to be gained by out-of-order processors with wider issue widths. However, traditional methods of increasing issue width do not scale; that is, they...
Recently, Network-on-Chip (NoC) architectures have gained popularity to address the interconnect delay problem for designing CMP / multi-core / SoC systems in deep sub-micron tech...
Dongkook Park, Soumya Eachempati, Reetuparna Das, ...
Modern processors rely on memory dependence prediction to execute load instructions as early as possible, speculating that they are not dependent on an earlier, unissued store. To...
Franziska Roesner, Doug Burger, Stephen W. Keckler
Writing shared-memory parallel programs is error-prone. Among the concurrency errors that programmers often face are atomicity violations, which are especially challenging. They h...
Brandon Lucia, Joseph Devietti, Karin Strauss, Lui...
Multicore processors have emerged as a powerful platform on which to efficiently exploit thread-level parallelism (TLP). However, due to Amdahl’s Law, such designs will be incr...
As CMOS technology scales and more transistors are packed on to the same chip, soft error reliability has become an increasingly important design issue for processors. Prior resea...
Xiaodong Li, Sarita V. Adve, Pradip Bose, Jude A. ...
Server storage systems use a large number of disks to achieve high performance, thereby consuming a significant amount of power. In this paper, we propose to significantly reduc...
Sriram Sankar, Sudhanva Gurumurthi, Mircea R. Stan
We analyze circuits for a number of kernels from popular quantum computing applications, characterizing the hardware resources necessary to take ancilla preparation off the critic...
Nemanja Isailovic, Mark Whitney, Yatish Patel, Joh...
Three-dimensional integration enables stacking memory directly on top of a microprocessor, thereby significantly reducing wire delay between the two. Previous studies have examin...