Main memory accesses continue to be a significant bottleneck for applications whose working sets do not fit in second-level caches. With the trend of greater associativity in seco...
Growth and usage trends for large decision support databases indicate that there is a need for architectures that scale the processing power as the dataset grows. To meet this nee...
Memorylatency isbecominganincreasingly importantperformance bottleneck, especially in multiprocessors. One technique for tolerating memory latency is multithreading, whereby we sw...
We consider a variety of dynamic, hardware-based methods for exploiting load/store parallelism, including mechanisms that use memory dependence speculation. While previous work ha...
We are attacking the memory bottleneck by building a “smart” memory controller that improves effective memory bandwidth, bus utilization, and cache efficiency by letting appl...
Binu K. Mathew, Sally A. McKee, John B. Carter, Al...
Processor architectures with tens to hundreds of arithmetic units are emerging to handle media processing applications. These applications, such as image coding, image synthesis, ...
Scott Rixner, William J. Dally, Brucek Khailany, P...
Efficiency of synchronization mechanisms can limit the parallel performance of many shared-memory applications. In addition, the ever increasing performance gap between processor...
Value prediction is a technique that breaks true data dependences by predicting the outcome of an instruction, and executes speculatively its data-dependent instructions based on ...
This paper presents flit-reservation flow control, in which control flits traverse the network in advance of data flits, reserving buffers and channel bandwidth. Flit-reservation ...