We aim to integrate television cooking videos with their corresponding cookbooks. A cookbook makes it easy to browse through a cooking procedure, but actual cooking operations are difficult to understand from written explanation alone. A video, on the other hand, contains visual information that text cannot sufficiently express, but it does not lend itself to random browsing through the procedure. We expect that integrating the two, by linking preparation steps (text) in a cookbook to video segments, will let each medium compensate for the drawbacks of the other. In this work, we propose a method that associates video segments with preparation steps in a supplementary cookbook by combining video structure analysis and text-based keyword matching. An experiment showed high accuracy in the association per video segment, i.e., in annotating the video.
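
As a rough illustration of the keyword-matching half of this idea, the following minimal sketch assigns each video segment to the preparation step with which it shares the most keywords. It is not the proposed method itself: it assumes each segment already comes with some associated text (for example a transcript), it omits video structure analysis entirely, and the names `STOPWORDS`, `tokenize`, and `associate` are hypothetical.

```python
import re

# Hypothetical, deliberately tiny stopword list; a real system would need more.
STOPWORDS = {"the", "a", "and", "in", "with", "to", "it", "we", "now"}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words minus stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def associate(segment_texts, step_texts):
    """For each segment, return the index of the step sharing the most keywords.

    Returns None for a segment that shares no keyword with any step.
    Ties are resolved in favor of the earliest step.
    """
    step_keywords = [tokenize(step) for step in step_texts]
    result = []
    for segment in segment_texts:
        words = tokenize(segment)
        # Score each step by the number of keywords it shares with the segment.
        scores = [len(words & keywords) for keywords in step_keywords]
        if not scores or max(scores) == 0:
            result.append(None)
        else:
            result.append(scores.index(max(scores)))
    return result


# Assumed toy data: three cookbook steps and the text attached to four segments.
steps = [
    "Chop the onion and carrot.",
    "Fry the onion in a pan with oil.",
    "Add water and simmer.",
]
segments = [
    "now we chop the carrot finely",
    "heat the oil in the pan",
    "let it simmer gently",
    "welcome back to the show",
]

print(associate(segments, steps))  # [0, 1, 2, None]
```

The last segment shares no keyword with any step and is left unassociated, which is one simple way to treat segments that do not show a preparation step.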