Abstract— NnSP is a stream-based, programmable, and code-level statically reconfigurable processor for realizing neural networks in embedded systems. NnSP is provided with a neural-network-to-stream compiler and a hardware core builder. The stream compiler makes it possible to realize various neural networks on NnSP, while the builder makes the NnSP processor an IP core that can be restructured to satisfy different demands and constraints. This paper presents the architecture of the NnSP processor, the streaming mechanism, and the builder facilities. Synthesis results of a 64-PE NnSP on a 0.18 µm standard-cell library are also presented. The results show that a 64-PE NnSP can perform 25.6 giga connections per second, while its throughput is up to