A key component of BM25 contributing to its success is its sub-linear term frequency (TF) normalization formula. The scale and shape of this TF normalization component is controll...
We introduce a generative probabilistic document model based on latent Dirichlet allocation (LDA), to deal with textual errors in the document collection. Our model is inspired by...