Memory latency tolerant architectures support thousands of in-flight instructions without scaling cycle-critical processor resources, and thousands of useful instructions can complete in parallel with a miss to memory. These architectures, however, require large queues to track all loads and stores executed while a miss is pending. Hierarchical designs alleviate the cycle-time impact of these structures, but the CAM and search functions required to enforce memory ordering and provide data forwarding place a high demand on area and power. We present new load-store processing algorithms for latency tolerant architectures. We augment the primary load and store queues with secondary buffers. The secondary load buffer is a set-associative structure, similar to a cache. The secondary store buffer, the Store Redo Log, is a first-in first-out structure that records the program order of all stores completed in parallel with a miss, and requires no CAM or search functions. Instead of the secondary store queue, a ca...
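As a rough software-level sketch (not the paper's hardware design), the Store Redo Log can be pictured as a plain ring-buffer FIFO: completed stores are appended in program order while the miss is outstanding and are later drained in that same order, with no associative lookup and hence no CAM ports. The entry fields, the capacity `SRL_ENTRIES`, and the function names below are illustrative assumptions, not details taken from the paper.

```c
#include <stdint.h>
#include <stdio.h>

#define SRL_ENTRIES 4096        /* assumed capacity; sized to the in-flight window */

/* One completed store, recorded in program order. */
typedef struct {
    uint64_t addr;
    uint64_t data;
    uint8_t  size;              /* bytes written */
} srl_entry_t;

/* Store Redo Log: a plain FIFO with no search function. */
typedef struct {
    srl_entry_t buf[SRL_ENTRIES];
    uint32_t head, tail;        /* head = oldest store, tail = next free slot */
} srl_t;

/* Append a store at completion time; returns -1 if the log is full. */
static int srl_append(srl_t *s, uint64_t addr, uint64_t data, uint8_t size) {
    uint32_t next = (s->tail + 1) % SRL_ENTRIES;
    if (next == s->head) return -1;
    s->buf[s->tail] = (srl_entry_t){ addr, data, size };
    s->tail = next;
    return 0;
}

/* Drain the log in program order (e.g. into the data cache) once the
 * blocking miss has been serviced. No entry is ever searched by address. */
static void srl_drain(srl_t *s, void (*commit)(const srl_entry_t *)) {
    while (s->head != s->tail) {
        commit(&s->buf[s->head]);
        s->head = (s->head + 1) % SRL_ENTRIES;
    }
}

static void commit_to_cache(const srl_entry_t *e) {
    printf("commit %u bytes of %#llx to %#llx\n",
           (unsigned)e->size,
           (unsigned long long)e->data,
           (unsigned long long)e->addr);
}

int main(void) {
    srl_t srl = { .head = 0, .tail = 0 };
    srl_append(&srl, 0x1000, 0xdeadbeef, 8);    /* recorded in program order */
    srl_append(&srl, 0x2000, 0x42, 4);
    srl_drain(&srl, commit_to_cache);           /* replayed in order, no CAM */
    return 0;
}
```

The contrast with a conventional store queue is that the latter must support an associative search on every load to find the youngest older matching store; the FIFO above deliberately offers only append and in-order drain.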