A method is presented for automatically extracting key frames from an image sequence. The sequence is divided into clusters of frames with similar appearance, and the most central frame in each cluster is taken as a key frame. Clustering uses an extension of the normalized cut segmentation technique, applied to the inter-frame similarities. The similarity between every pair of frames in the sequence is computed from the spatial image characteristics via a shape matching technique. We demonstrate the algorithm by extracting 20 key frames from a 30-second (900-frame) video sequence of a tennis player in action.
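The cluster-then-pick-the-central-frame pipeline can be illustrated with a minimal sketch. It assumes the pairwise frame similarity matrix `W` is already available (the shape matching step is not reproduced here), and it uses the standard two-way spectral relaxation of normalized cut rather than the extension described above. The function names `ncut_bipartition` and `central_frames`, the toy matrix, and the choice of "largest total within-cluster similarity" as the definition of centrality are all assumptions made for illustration.

```python
import numpy as np


def ncut_bipartition(W):
    """Split frames into two groups with the standard spectral
    relaxation of normalized cut (an assumption; not the paper's extension)."""
    d = W.sum(axis=1)                      # degree of each frame
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)        # eigenvalues in ascending order
    # Second-smallest eigenvector, mapped back to the generalized problem
    fiedler = d_inv_sqrt * vecs[:, 1]
    return (fiedler > 0).astype(int)       # label by sign


def central_frames(W, labels):
    """For each cluster, return the index of the frame with the largest
    total similarity to the frames in its own cluster (assumed centrality)."""
    keys = {}
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        within = W[np.ix_(members, members)].sum(axis=1)
        keys[int(c)] = int(members[np.argmax(within)])
    return keys


# Toy similarity matrix for six frames: two groups of three similar frames,
# with weak similarity (0.01) between the groups. Values are made up.
block = np.array([[1.0, 0.9, 0.5],
                  [0.9, 1.0, 0.9],
                  [0.5, 0.9, 1.0]])
W = np.full((6, 6), 0.01)
W[:3, :3] = block
W[3:, 3:] = block

labels = ncut_bipartition(W)
key_frames = sorted(central_frames(W, labels).values())
```

In this toy case the two groups are recovered from the sign of the second eigenvector, and the middle frame of each group is selected because it is the most similar to its neighbours. Obtaining more than two clusters, as needed for the 20 key frames reported, would require applying the split recursively or using several eigenvectors; which of these the authors' extension does is not stated here.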