Although silicon optical technology is still in its formative stages, and the more near-term application is chip-to-chip communication, rapid advances have been made in the develo...
Nevin Kirman, Meyrem Kirman, Rajeev K. Dokania, Jo...
Designing and optimizing high performance microprocessors is an increasingly difficult task due to the size and complexity of the processor design space, high cost of detailed si...
P. J. Joseph, Kapil Vaswani, Matthew J. Thazhuthav...
We present Adaptive Stream Detection, a simple technique for modulating the aggressiveness of a stream prefetcher to match a workload’s observed spatial locality. We use this co...
With the trend towards increasing number of processor cores in future chip architectures, scalable directory-based protocols for maintaining cache coherence will be needed. Howeve...
This paper presents and studies a distributed L2 cache management approach through OS-level page allocation for future many-core processors. L2 cache management is a crucial multi...
Instruction aggregation—the grouping of multiple operations into a single processing unit—is a technique that has recently been used to amplify the bandwidth and capacity of c...
The large working sets of commercial and scientific workloads stress the L2 caches of Chip Multiprocessors (CMPs). Some CMPs use a shared L2 cache to maximize the on-chip cache c...
Bradford M. Beckmann, Michael R. Marty, David A. W...
We introduce virtually-pipelined memory, an architectural technique that efficiently supports high-bandwidth, uniform latency memory accesses, and high-confidence throughput eve...
3D die stacking is an exciting new technology that increases transistor density by vertically integrating two or more die with a dense, high-speed interface. The result of 3D die ...
Bryan Black, Murali Annavaram, Ned Brekelbaum, Joh...
The continuing VLSI technology scaling leads to increasingly significant power supply fluctuations, which need to be modeled accurately in circuit design and verification. Meanwhi...