We develop a general framework for automatically matching electronic slides to videos of the corresponding presentations. Applications include indexing and browsing in educational and corporate digital video libraries. Our approach extends previous work that matches slides using visual features alone, integrating multiple cues to improve performance in more difficult cases. We model slide changes in a presentation with a dynamic Hidden Markov Model (HMM) that captures their temporal structure and whose transition probabilities are adapted locally using camera events during inference. Our results show that combining multiple cues in a state model greatly improves performance in ambiguous cases.
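To make the modeling idea concrete, the following is a minimal sketch of how an HMM over slide states with locally adapted transition probabilities might be decoded. All function names, parameter values, and the event-adaptation scheme here are illustrative assumptions, not the paper's actual implementation: we simply lower the self-transition probability near a detected camera event so a slide change becomes more likely at that frame, then run standard Viterbi decoding.

```python
import numpy as np

def adapted_transitions(n_slides, event_frames, T, p_stay=0.95, p_stay_event=0.5):
    """Build per-frame log transition matrices (illustrative assumption):
    near a detected camera event, lower the self-transition probability
    so a slide change is more likely at that frame."""
    A = np.empty((T - 1, n_slides, n_slides))
    for t in range(T - 1):
        stay = p_stay_event if t in event_frames else p_stay
        move = (1.0 - stay) / (n_slides - 1)  # spread rest over other slides
        A[t] = np.full((n_slides, n_slides), move)
        np.fill_diagonal(A[t], stay)
    return np.log(A)

def viterbi(obs_logprob, log_trans, log_init):
    """Standard Viterbi decoding over T frames and N slide states.
    log_trans may be a single (N, N) matrix or a per-frame (T-1, N, N)
    stack, which is how the event adaptation enters the inference."""
    T, N = obs_logprob.shape
    dp = np.full((T, N), -np.inf)
    bp = np.zeros((T, N), dtype=int)
    dp[0] = log_init + obs_logprob[0]
    for t in range(1, T):
        A = log_trans[t - 1] if log_trans.ndim == 3 else log_trans
        scores = dp[t - 1][:, None] + A       # scores[i, j]: from slide i to j
        bp[t] = scores.argmax(axis=0)
        dp[t] = scores.max(axis=0) + obs_logprob[t]
    path = [int(dp[-1].argmax())]
    for t in range(T - 1, 0, -1):             # backtrack best state sequence
        path.append(int(bp[t, path[-1]]))
    return path[::-1]
```

For example, with three slides, frame-level visual match likelihoods favoring slide 0 then slide 1, and a camera event detected at the changeover, the decoded path switches slides exactly once at the event.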