3D video, which consists of a sequence of 3D mesh models, provides detailed 3D information in both the spatial and temporal domains. In this paper, we develop a key frame extraction method that summarizes 3D video by rate-distortion (R-D) optimization. For this purpose, we introduce an effective algorithm for extracting feature vectors from 3D video. Prior to key frame extraction, shot detection is performed on the feature vectors as a pre-processing step. Then, an R-D curve is generated for each shot, with the locations of the key frames optimized at each rate. Finally, the R-D trade-off is achieved by minimizing a cost function with a Lagrange multiplier. Our experimental results show that the extracted key frames are compact and faithful to the original 3D video.
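The Lagrangian selection step can be illustrated with a minimal sketch. It is not the method of this paper: it assumes that the rate R is the number of key frames in a shot, that the distortion D is the sum of Euclidean distances from each frame's feature vector to its nearest key frame, and that key frame locations are found by exhaustive search, which is only practical for very short shots. The names `distortion`, `rd_curve`, `select_key_frames`, and `lam` are hypothetical.

```python
import math
from itertools import combinations


def distortion(features, key_idx):
    """Assumed distortion: sum over all frames of the Euclidean
    distance to the nearest key frame's feature vector."""
    return sum(
        min(math.dist(f, features[k]) for k in key_idx)
        for f in features
    )


def rd_curve(features):
    """Build one R-D point per rate k = 1..N for a single shot.

    For each k, the key frame locations are chosen by exhaustive
    search to minimize the distortion (exponential cost, toy use only).
    Returns a list of (rate, distortion, key_frame_indices).
    """
    n = len(features)
    curve = []
    for k in range(1, n + 1):
        best = min(
            combinations(range(n), k),
            key=lambda idx: distortion(features, idx),
        )
        curve.append((k, distortion(features, best), best))
    return curve


def select_key_frames(features, lam):
    """Pick the R-D point that minimizes the cost J = D + lam * R."""
    return min(rd_curve(features), key=lambda p: p[1] + lam * p[0])
```

For a toy shot with one-dimensional features `[(0.0,), (0.1,), (0.2,), (5.0,), (5.1,)]`, calling `select_key_frames(features, 1.0)` keeps two key frames, one for each cluster of similar frames. A larger `lam` penalizes rate more heavily and yields fewer key frames, while `lam = 0` keeps every frame.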