A low cost, multithreaded processing-in-memory system

This paper discusses die cost vs. performance tradeoffs for a PIM system that could serve as the memory system of a host processor. For an increase of less than twice the cost of a commodity DRAM part, it is possible to realize a performance speedup of nearly a factor of 4 on irregular applications. This cost efficiency derives from developing a custom multithreaded processor architecture and implementation style that is well-suited for embedding in a memory. Specifically, it takes advantage of the low latency and high row bandwidth both to simplify processor design, reducing area, and to improve processing throughput. To support our claims of cost and performance, we have used simulation and analysis of existing chips, and have also designed and fully implemented a prototype chip, PIM Lite.

Jay B. Brockman, Shyamkumar Thoziyoor, Shannon K. Kuntz, Peter M. Kogge

Commodity DRAM Part | Cost Vs | Custom Multithreaded Processor | Hardware | WMPI 2004

» Hybrid multithreading for VLIW processors

» The Thread Migration Mechanism of DSMPEPE

» Hoard: A Scalable Memory Allocator for Multithreaded Applications

» Lightweight Lexical Closures for Legitimate Execution Stack Access

» Improving server software support for simultaneous multithreaded processors

» mSWAT: low-cost hardware fault detection and diagnosis for multicore systems

» Kendo: efficient deterministic multithreading in software

» Accurate branch prediction for short threads

Post Info

Added	30 Jun 2010
Updated	30 Jun 2010
Type	Conference
Year	2004
Where	WMPI
Authors	Jay B. Brockman, Shyamkumar Thoziyoor, Shannon K. Kuntz, Peter M. Kogge
