Automatic indexing of video data is in strong demand as the volume of such data grows. We propose an automatic indexing method for television news video that assigns keywords to shots by considering the correspondence between image content and the semantic attributes of keywords. The method first (1) classifies shots by their graphical features and (2) analyzes the semantic attributes of the accompanying captions. Keywords are then selectively assigned to shots according to the correspondence between typical shot classes and the semantic attributes of the keywords. Applied to 75 minutes of actual news video, the method successfully indexed approximately 50% of the typical shots (60% of the shots were classified as typical) and 80% of the typical shots for which captions existed.
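The selective indexing step can be illustrated with a minimal sketch. It assumes each shot has already been given a shot class (or none, if it is not typical) and that each caption keyword has already been tagged with a semantic attribute. The class names, attribute names, and the correspondence table below are hypothetical placeholders chosen for illustration; they are not the classes or rules used by the method itself.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical correspondence table: which semantic attributes of keywords
# are considered appropriate for each typical shot class.
CORRESPONDENCE: Dict[str, Set[str]] = {
    "person_closeup": {"person"},
    "scenery": {"location"},
}


@dataclass
class Shot:
    shot_id: int
    # Shot class from graphical-feature classification; None if not typical.
    shot_class: Optional[str]
    # Caption keywords paired with their semantic attribute.
    caption_keywords: List[Tuple[str, str]]


def index_shots(shots: List[Shot]) -> Dict[int, List[str]]:
    """Assign to each shot only the keywords whose attribute matches its class."""
    index: Dict[int, List[str]] = {}
    for shot in shots:
        # Non-typical or unknown classes have no allowed attributes.
        allowed = CORRESPONDENCE.get(shot.shot_class, set())
        matched = [kw for kw, attr in shot.caption_keywords if attr in allowed]
        if matched:
            index[shot.shot_id] = matched
    return index


shots = [
    Shot(1, "person_closeup", [("Tanaka", "person"), ("Tokyo", "location")]),
    Shot(2, "scenery", [("Tokyo", "location")]),
    Shot(3, None, [("Tokyo", "location")]),  # not a typical shot
    Shot(4, "scenery", []),                  # typical, but no caption
]
result = index_shots(shots)
```

In this sketch, shots without a typical class and shots without captions receive no keywords, which mirrors why the reported success rate is higher when restricted to typical shots where captions existed.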