In this paper, we explored how to use metadata in information retrieval tasks. We presented a new language model that takes advantage of category information for documents to improve retrieval accuracy. We compared the new language model with the traditional language model on the TREC4 dataset, where the collection information for documents is obtained using the k-means clustering method. The new language model outperforms the traditional language model, which supports our claim that category information can improve retrieval accuracy.

Categories and Subject Descriptors: Language Models

Keywords: Language model for IR

Rong Jin, Luo Si, Alexander G. Hauptmann, James P.