Abstract. Program behavior shows that programs have different resource requirements in different phases, even across consecutive clock cycles. Running different programs on a fixed set of hardware resources is therefore inefficient, and sharing hardware to a feasible degree may benefit programs. This paper proposes an architecture that shares function units between neighboring cores in a chip multiprocessor (CMP) to improve chip performance. Function units are central to the core; they occupy little area and are not the performance-critical part of the core, but improving their utilization can improve the efficiency of other units and the performance of the core as a whole. In our design, a sharing priority guarantees that the local thread is not affected by threads on neighboring cores. Sharing latency is addressed by making the sharing decision early and by providing a direct data path. The evaluation shows that the proposal benefits function-unit-intensive programs and makes other units operate more efficiently.