Enterprise corpora contain evidence of what employees work on and can therefore be used to automatically find experts on a given topic. We present a general approach for representing the knowledge of a potential expert as a mixture of language models from associated documents. First, we retrieve documents given the expert’s name using a generative probabilistic technique and weight the retrieved documents according to an expert-specific posterior distribution. Then we model the expert indirectly through the set of associated documents, which allows us to exploit their underlying structure and complex language features. Experiments show that our method has excellent performance on the TREC 2005 expert search task and that it effectively collects and combines evidence for expertise in a heterogeneous collection.
Desislava Petkova, W. Bruce Croft
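
The idea of representing an expert as a mixture of document language models can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy documents, the per-document weights (standing in for the expert-specific posterior), and the linear interpolation with a background model are all assumptions made for illustration.

```python
import math
from collections import Counter

def doc_model(tokens):
    """Maximum-likelihood unigram model of one non-empty document."""
    counts = Counter(tokens)
    total = len(tokens)
    return {t: c / total for t, c in counts.items()}

def expert_model(docs, weights):
    """Mixture of document models; weights are normalized to sum to 1.

    The weights play the role of P(document | expert). How they are
    estimated is not shown here; they are supplied by hand (assumption).
    """
    z = sum(weights)
    mix = Counter()
    for tokens, w in zip(docs, weights):
        for term, p in doc_model(tokens).items():
            mix[term] += (w / z) * p
    return dict(mix)

def query_log_likelihood(query, model, background, lam=0.5):
    """Log-likelihood of the query under the smoothed expert model.

    Linear interpolation with a background model is an assumed smoothing
    choice; every query term must occur in the background model.
    """
    return sum(
        math.log((1 - lam) * model.get(t, 0.0) + lam * background.get(t, 0.0))
        for t in query
    )

# Hypothetical toy data: two candidates, two associated documents each.
docs_a = [["language", "models", "retrieval"], ["retrieval", "evaluation"]]
docs_b = [["database", "index"], ["index", "storage", "retrieval"]]

model_a = expert_model(docs_a, weights=[0.75, 0.25])
model_b = expert_model(docs_b, weights=[0.5, 0.5])

# Background model estimated from all documents in the toy collection.
background = doc_model([t for d in docs_a + docs_b for t in d])

query = ["retrieval", "models"]
score_a = query_log_likelihood(query, model_a, background)
score_b = query_log_likelihood(query, model_b, background)
```

In this toy setting, candidates would be ranked by their query log-likelihood, so the candidate whose associated documents better explain the query terms ranks higher.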