Many biological propositions can be supported by several different types of evidence. It is often useful to collect large numbers of such propositions, together with the evidence supporting them, into databases for use in other analyses. Methods that automatically make preliminary choices about which propositions to include can be helpful if they are accurate enough. Making these choices can involve weighing evidence of varying strength. We describe a method for learning a scoring function that weighs evidence of different types. The algorithm evaluates each source of evidence by the extent to which other sources tend to support it. The details are guided by a probabilistic formulation of the problem, building on previous theoretical work. We evaluate our method on synthetic data and by applying it to predict protein-protein interactions in yeast.
Philip M. Long, Vinay Varadan, Sarah Gilman, Mark