Multi-processor architectures have gained interest recently because of their ability to exploit programmable silicon parallelism at acceptable power-efficiency figures. Despite the potential benefit they offer over single-processor architectures, it is unresolved how one can write compact and efficient programs for multiple parallel cores. In this paper, we propose the use of a synchronous hardware description language to program a network of small PicoBlaze processors. The partitioning of a multiprocessor program over multiple cores is straightforward because the input specification is fully parallel. A systematic transformation process converts the parallel input specification into concurrent PicoBlaze programs. We demonstrate the mapping of a cryptographic design (AES) onto four picoblaze processors, showing almost linear speedup over an equivalent single-core design.