Although there are many neural network FPGA architectures, there is no framework for designing large, high-performance neural networks suitable for real-world applications. In this paper, we present two concepts to support a multi-FPGA architecture for stochastic Restricted Boltzmann Machines (RBMs), a popular type of neural network. First, a hardware core, called the kth Stage Piecewise Linear Interpolator, is used to implement a high-precision, pipelined function generator. The interpolator increases the resolution of a lookup table implementation, guaranteeing an additional bit of precision for every pipeline stage. This function generator is used to implement the sigmoid function required in stochastic node selection. Next, a partitioning algorithm is used to efficiently divide an RBM among multiple FPGAs. The partitioning algorithm optimizes performance by minimizing inter-FPGA communication. The architecture is tested on the Berkeley Emulation Engine 2 running at 100 MHz. One board su...
Daniel L. Ly, Paul Chow
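
The idea of refining a lookup table by one bit of input resolution per stage can be illustrated in software. The following is a minimal sketch, not the authors' hardware core: it assumes a coarse sigmoid table over a hypothetical input range and refines the result by repeated midpoint bisection, which is one plausible way to obtain piecewise linear interpolation using only an add and a shift per stage. The names `X_MIN`, `X_MAX`, `LUT_BITS`, `interp_sigmoid`, and `max_error` are illustrative assumptions and do not come from the paper.

```python
import math

def sigmoid(x):
    """Reference sigmoid used to fill the table and to measure error."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical parameters: input range and number of table address bits.
X_MIN, X_MAX = -8.0, 8.0
LUT_BITS = 6  # 2**6 = 64 segments, 65 stored endpoints

# Coarse lookup table holding the sigmoid at each segment endpoint.
LUT = [
    sigmoid(X_MIN + i * (X_MAX - X_MIN) / (1 << LUT_BITS))
    for i in range((1 << LUT_BITS) + 1)
]

def interp_sigmoid(x, stages):
    """Approximate sigmoid(x) using the table plus `stages` bisection steps.

    Each step consumes one more bit of the quantized input and halves the
    interval between two known output values, so every step resolves one
    additional input bit.
    """
    x = min(max(x, X_MIN), X_MAX)  # clamp to the table's range
    total_bits = LUT_BITS + stages
    scale = (1 << total_bits) / (X_MAX - X_MIN)
    idx = min(int((x - X_MIN) * scale), (1 << total_bits) - 1)

    seg = idx >> stages                 # upper bits address the table
    frac = idx & ((1 << stages) - 1)    # lower bits steer the bisection
    lo, hi = LUT[seg], LUT[seg + 1]

    # Walk the fractional bits from most to least significant.
    for s in reversed(range(stages)):
        mid = 0.5 * (lo + hi)           # in hardware: an add and a shift
        if (frac >> s) & 1:
            lo = mid                    # input lies in the upper half
        else:
            hi = mid                    # input lies in the lower half
    return lo

def max_error(stages, samples=2001):
    """Largest absolute error over an evenly spaced sample of the range."""
    xs = [X_MIN + i * (X_MAX - X_MIN) / (samples - 1) for i in range(samples)]
    return max(abs(interp_sigmoid(x, stages) - sigmoid(x)) for x in xs)
```

Calling `max_error(0)` and `max_error(4)` compares the bare table against four refinement steps. In this sketch the remaining error after many steps is bounded by the curvature of the sigmoid within one table segment, so the table size and the number of steps would need to be chosen together.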