A high performance communication architecture, SAMBA-bus, is proposed in this paper. In SAMBA-bus, multiple compatible bus transactions can be performed simultaneously with only a single bus access grant from the bus arbiter. Experimental results show that, compared with a traditional bus architecture, the SAMBA-bus architecture can have up to 3.5 times improvement in the effective bandwidth, and up to 15 times reduction in the average communication latency. In addition, the performance of SAMBA-bus architecture is affected only slightly by arbitration latency, because bus transactions can be performed without waiting for the bus access grant from the arbiter. This feature is desirable in SoC designs with large numbers of modules and long communication delay between modules and the bus arbiter.