Abstract. We address the problem of designing and building efficient custom VLSI-based processors to do computations on large multi-dimensional lattices. The design tradeoffs for two architectures that provide practical engines for lattice updates are derived and analyzed. We find that I/O constitutes the principal bottleneck of processors designed for lattice computations, and we derive upper bounds on throughput for lattice updates based on Hong and Kung's graph-pebbling argument that models I/O. In particular, we show that $R = O(BS^{1/d})$, where $R$ is the site update rate, $B$ is the main memory bandwidth, $S$ is the processor storage, and $d$ is the dimension of the lattice.
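To make the scaling in the bound $R = O(BS^{1/d})$ concrete, the following is a minimal sketch that evaluates the expression $c \cdot B \cdot S^{1/d}$ for given parameters. The function names and the constant `c` are assumptions introduced here for illustration; the abstract does not give the constant hidden by the O-notation, so the values are meaningful only as ratios, not as absolute update rates.

```python
def update_rate_bound(bandwidth, storage, dim, c=1.0):
    """Evaluate c * B * S**(1/d), the form of the upper bound on the update rate.

    c is a hypothetical placeholder for the constant hidden by the O-notation;
    it is not specified in the abstract.
    """
    if dim < 1:
        raise ValueError("lattice dimension must be at least 1")
    return c * bandwidth * storage ** (1.0 / dim)


def storage_factor_for_speedup(speedup, dim):
    """Factor by which S must grow to raise the bound by `speedup` at fixed B.

    Since the bound scales as S**(1/d), S must grow by speedup**d.
    """
    return speedup ** dim


if __name__ == "__main__":
    # Doubling bandwidth doubles the bound, in any dimension.
    base = update_rate_bound(bandwidth=1.0, storage=16.0, dim=2)
    print(update_rate_bound(bandwidth=2.0, storage=16.0, dim=2) / base)

    # Storage needed to double the bound at fixed bandwidth, for d = 1, 2, 3.
    for d in (1, 2, 3):
        print(d, storage_factor_for_speedup(2, d))
```

Read this way, the bound suggests that extra memory bandwidth pays off linearly, while extra on-chip storage pays off only as the $d$-th root: under this expression, doubling the bound at fixed bandwidth takes 4 times the storage for a two-dimensional lattice and 8 times for a three-dimensional one.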
Steven D. Kugelmass, Kenneth Steiglitz, Richard K.