Reconfigurable processors based on FPGAs (Field-Programmable Gate Arrays) have been shown to meet increasingly challenging performance targets under growing time-to-market pressure. In this paper, we propose a method to rapidly estimate the FPGA area costs of custom instructions without the need for hardware synthesis. The proposed estimation technique relies on a novel approach that partitions the custom instruction data-paths into a set of clusters, where each cluster can be realized using an FPGA logic element or a coarse-grained arithmetic unit. Experiments based on 20 custom instructions show that the estimated area costs are, on average, only 8% higher than the corresponding hardware synthesis results. In addition, we show that, for each of the seven applications examined, the custom instructions occupy at most the equivalent of about 1000 Xilinx FPGA logic elements.
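To illustrate the general idea of cluster-based area estimation, the following is a minimal sketch and not the partitioning method proposed in this paper. It assumes a data-path given as a topologically ordered list of operations, a hypothetical 4-input logic element (`LUT_INPUTS`), one logic element per bit for each cluster of bitwise operations, and one coarse-grained unit per arithmetic operation. The names `Op`, `estimate_area`, `LOGIC_OPS`, and `ARITH_OPS` are introduced here for illustration only.

```python
from dataclasses import dataclass

LUT_INPUTS = 4                           # assumed inputs per logic element (hypothetical)
LOGIC_OPS = {"and", "or", "xor", "not"}  # assumed fine-grained (bitwise) operations
ARITH_OPS = {"add", "sub", "mul"}        # assumed coarse-grained operations


@dataclass
class Op:
    name: str      # name of the value this operation produces
    kind: str      # one of LOGIC_OPS or ARITH_OPS
    inputs: list   # names of producer operations or primary inputs
    width: int = 32


def estimate_area(ops):
    """Greedily cluster a topologically ordered data-path and count resources."""
    # Count how many operations consume each produced value.
    fanout = {op.name: 0 for op in ops}
    for op in ops:
        for src in op.inputs:
            if src in fanout:
                fanout[src] += 1

    clusters = []    # each cluster: {"kind", "members", "inputs", "width"}
    cluster_of = {}  # operation name -> index into clusters

    for op in ops:
        if op.kind not in LOGIC_OPS | ARITH_OPS:
            raise ValueError(f"unknown operation kind: {op.kind}")
        target = None
        if op.kind in LOGIC_OPS:
            # Try to absorb this op into the logic cluster of a producer
            # whose only consumer is this op, if the input budget allows.
            for src in op.inputs:
                idx = cluster_of.get(src)
                if idx is None or clusters[idx]["kind"] != "logic" or fanout[src] != 1:
                    continue
                c = clusters[idx]
                merged = c["inputs"] | {i for i in op.inputs if i not in c["members"]}
                if len(merged) <= LUT_INPUTS:
                    c["inputs"] = merged
                    target = idx
                    break
        if target is None:
            # Start a new cluster (arithmetic ops always get their own).
            clusters.append({
                "kind": "logic" if op.kind in LOGIC_OPS else "arith",
                "members": set(),
                "inputs": set(op.inputs),
                "width": 0,
            })
            target = len(clusters) - 1
        c = clusters[target]
        c["members"].add(op.name)
        c["width"] = max(c["width"], op.width)
        cluster_of[op.name] = target

    # Assumed cost model: one logic element per bit of each logic cluster.
    logic_elements = sum(c["width"] for c in clusters if c["kind"] == "logic")
    arith_units = sum(1 for c in clusters if c["kind"] == "arith")
    return {"clusters": len(clusters),
            "logic_elements": logic_elements,
            "arith_units": arith_units}


# Hypothetical custom instruction: d = (((x & y) ^ z) | w) + v
datapath = [
    Op("a", "and", ["x", "y"]),
    Op("b", "xor", ["a", "z"]),
    Op("c", "or", ["b", "w"]),
    Op("d", "add", ["c", "v"]),
]
print(estimate_area(datapath))
```

In this example the three bitwise operations share four distinct inputs and therefore collapse into a single logic cluster, while the addition is counted as one arithmetic unit, giving 2 clusters, 32 logic elements, and 1 arithmetic unit under the assumed cost model. A realistic estimator would need device-specific costs for each cluster type, which this sketch does not attempt to model.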