We present a novel method for the joint reconstruction of image and motion in positron emission tomography (PET). Most existing methods separate image estimation from motion estimation: they first use deformable image registration or optical flow techniques to estimate the motion from individually reconstructed gates, and then estimate the image based on this motion information. The main problem with these methods lies in the motion estimation step, which relies on the noisy gated frames: the more noise is present, the less accurate the image registration becomes. As we show in a simulation study, our joint reconstruction approach overcomes these drawbacks and yields both visually and quantitatively better image quality. We attribute these results to the fact that motion estimation always uses the best currently available image estimate, and vice versa. Additionally, we present results for real patient data with dual respiratory and cardiac gating.