We address the problem of selecting non-domain-specific language model training data to build auxiliary language models for use in tasks such as machine translation. Our approach i...
It is well known that occurrence counts of words in documents are often modeled poorly by standard distributions like the binomial or Poisson. Observed counts vary more than simpl...
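The overdispersion this abstract refers to can be checked with a variance-to-mean ratio: a Poisson model predicts a ratio of 1, so a much larger value means the counts vary more than the model allows. The sketch below is a minimal illustration, not the paper's method; the function name `dispersion_index` and the sample counts are hypothetical and made up for the example.

```python
from statistics import mean, pvariance


def dispersion_index(counts):
    """Variance-to-mean ratio of a list of counts.

    A Poisson model implies a ratio of 1.0; values well above 1.0
    indicate overdispersion (burstier counts than Poisson predicts).
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; ratio is undefined")
    return pvariance(counts) / m


# Hypothetical per-document counts of a single word (illustrative only).
# Both lists have the same mean (2 occurrences per document).
bursty_counts = [0, 0, 0, 0, 0, 0, 0, 0, 9, 11]  # word clusters in two documents
even_counts = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]     # word spread uniformly

if __name__ == "__main__":
    print("bursty:", dispersion_index(bursty_counts))
    print("even:  ", dispersion_index(even_counts))
```

Both samples share the same mean, so a model fit only to the mean cannot tell them apart; the ratio separates them.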
In information retrieval, relevance of documents with respect to queries is usually judged by humans, and used in evaluation and/or learning of ranking functions. Previous work ha...
Jingfang Xu, Chuanliang Chen, Gu Xu, Hang Li, Elbi...
This article discusses a latent variable model for inference and prediction of symmetric relational data. The model, based on the idea of the eigenvalue decomposition, represents ...
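Because the abstract is cut off before it describes the model, the following is only a sketch of the general eigenvalue-decomposition form for a symmetric matrix, M = U Λ Uᵀ, and is not taken from the article. The function name `eigen_form_scores`, the latent vectors `U`, and the weights `lam` are hypothetical values chosen for illustration.

```python
def eigen_form_scores(U, lam):
    """Build the symmetric matrix M = U diag(lam) U^T.

    U   : list of n latent vectors, each of length k (one per node)
    lam : list of k weights (the diagonal of Lambda)
    Entry M[i][j] = sum_k lam[k] * U[i][k] * U[j][k], so M[i][j] == M[j][i].
    """
    n = len(U)
    return [
        [sum(w * U[i][k] * U[j][k] for k, w in enumerate(lam)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical latent vectors for three nodes and two weights (assumed values).
U = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
lam = [2.0, -1.0]

scores = eigen_form_scores(U, lam)

if __name__ == "__main__":
    for row in scores:
        print(row)
```

The construction is symmetric by design, which matches the symmetric relational data the abstract mentions; allowing negative weights lets a shared characteristic lower as well as raise a pairwise score.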
Hiding data values in privacy-preserving data mining (PPDM) protects information against unauthorized attacks while maintaining analytical data properties. The most popular models...