Language models used in current automatic speech recognition systems are trained on general-purpose corpora and are therefore poorly suited to transcribing spoken documents that address a succession of specific topics, such as long multimedia streams containing reports and debates. To overcome this problem, this paper shows that Web resources and natural language processing techniques can be used to automatically collect topic-specific corpora from the Internet in order to adapt the baseline language model of an automatic speech recognition system. We detail how to characterize the topic of a segment and how to collect Web pages from which a topic-specific language model can be trained. We then present experiments in which an adapted language model, obtained by combining the topic-specific language model with the general-purpose one, is used to produce new transcriptions. The results show that our topic adaptation technique leads to significant gains in transcription quality.
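
The abstract does not specify how the two models are combined; a common way to merge a topic-specific language model with a general-purpose baseline is linear interpolation, sketched below. The weight \lambda is a hypothetical parameter, typically tuned on held-out in-domain text, and the paper's actual combination scheme may differ:

P_{\mathrm{adapted}}(w \mid h) = \lambda\, P_{\mathrm{topic}}(w \mid h) + (1 - \lambda)\, P_{\mathrm{general}}(w \mid h)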