This paper presents a novel architecture for on-chip neural network training using particle swarm optimization (PSO). PSO is an evolutionary optimization algorithm with a growing field of applications which has been recently used to train neural networks. The architecture exploits PSO algorithm to evolve network weights as well as a method called layer partitioning to implement neural networks. In the proposed method, a neural network is partitioned into groups of neurons and the groups are sequentially mapped to available functional units. Thus, the architecture is reconfigurable for training and implementing different multilayer feedforward neural networks without the need for modifying the architecture. The implementation is intended for real-time applications regarding hardware cost and speed. The results show that the proposed system provides a trade-off between resource requirements and speed.