This paper presents an architecture for programmable systolic arrays that provides simple and e cient systolic communication. The Brown Systolic Array is a linear implementation of this Systolic Shared Register architecture; a working 470-processor prototype system performs 108 MOPS. A 32-chip, 1504-processor implementation could provide 5 GOPS of systolic co-processing power on a single board.
Richard Hughey, Daniel P. Lopresti