The paper presents a high-performance architecture of the bit-plane coder for the embedded block coding algorithm in JPEG 2000. The architecture adopts a pipeline structure and is dedicated to generate two context-symbol pairs per clock cycle. A novel method called Dynamic Significance State Restoring (DSSR) allows reduction of on-chip memories. The overall design is described in VHDL and synthesized for FPGA and ASIC technologies. Simulation results show that for FPGA Stratix devices, the engine can process about 22 million samples at the frequency of 66 MHz.