Garbage collection considerably increases programmer productivity and software quality. However, it is difficult to implement garbage collection in a way that is both efficient and suitable for real-time systems. Today, garbage collection is realized exclusively in software, and such implementations either fail to guarantee a small upper bound on pause times or suffer from considerable synchronization overhead. In this paper, we present the design and implementation of an on-chip garbage collection coprocessor that closely cooperates with the main processor. The benefits of this configuration include low garbage collection overhead, low-cost synchronization between the collector and application programs, and hard real-time capabilities. We successfully realized the garbage collection coprocessor along with a pipelined RISC processor on a single FPGA. Performance measurements on the prototype show that the longest pauses caused by the garbage collector are less than 500 clock cycles and that the total runtime overhead is as...