Information Retrieval systems can be improved by exploiting context information such as user and document features. This article presents a model based on overlapping probabilistic...
Documents often display a structure, e.g., several sections, each with several subsections and so on. Taking into account the structure of a document allows the retrieval process ...
The traditional weighting schemes used in text categorization for the vector space model (VSM) cannot exploit information intrinsic to texts obtained through on-line handwriting r...
We introduce a generative probabilistic document model based on latent Dirichlet allocation (LDA), to deal with textual errors in the document collection. Our model is inspired by...
In e-business development, semantics-oriented document exchange is becoming important, because it can support crossdomain user connection, business transaction and collaboration. ...