This paper proposes a novel Behavioral Synthesis method that improves performance of synthesized circuits utilizing specialized functional units effectively. Specialized functional units (e.g. Multiply-Accumulator) are designed for specific operation patterns to achieve shorter delay and/or smaller area than cascaded basic functional units. Almost all conventional methods cannot use specialized functional units effectively under a total area constraint because of their less flexibility for resource sharing. The proposed method makes it possible to solve module selection, scheduling, and functional unit allocation problems utilizing specialized functional units in practical time with some heuristics, and to reduce the number of clock cycles under total area and clock cycle time constraints. Experimental results show that the proposed method has achieved up to 35% and on average 14% reduction of the number of cycles with specialized functional units in practical time.