MIMO systems have attracted great attention because of their high capacity. Hardware implementation of the MIMO decoder becomes increasingly challenging as the complexity of MIMO systems grows. This paper presents a hardware/software co-design architecture, targeted at a single FPGA, for two typical lattice decoding algorithms. Two levels of parallelism are analyzed to obtain an efficient implementation, with the preprocessing part running on an embedded MicroBlaze soft processor and the decoder part on customized hardware. System prototypes of the AV and VB decoders support data rates of up to 34.2 Mbps and 3.15 Mbps, respectively, at 20 dB SNR on an XUP Virtex-II Pro development board with an xc2vp30 FPGA, which is 19 and 16 times faster than their respective implementations on a DSP. The two algorithms are also compared in terms of resource utilization and bit error rate.

Keywords: Hardware/Software co-design, MIMO, Lattice decoder, decoding algorithms, FPGA, DSP