Many applications need to rapidly infer a story about a given subject from a set of potentially heterogeneous data sources. In this paper, we formally define a story as a set of facts about a given subject that satisfies a “story length” constraint. An optimal story is a story that maximizes the value of an objective function measuring the goodness of a story. We present algorithms to extract stories from text and other data sources. We also develop an algorithm to compute an optimal story, as well as three heuristic algorithms to rapidly compute a suboptimal story. Our experiments show that stories can be constructed efficiently and that the stories produced by these heuristic algorithms are of high quality. We have built a prototype STORY system based on our model; we briefly describe the prototype as well as one application in this paper.
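The definition of an optimal story can be written compactly as a constrained maximization. The following is only a sketch in notation assumed for illustration: the symbols F_s (the facts extracted about subject s), len (a length measure), k (the length bound), and f (the objective function) do not appear in the text above, and reading the "story length" constraint as an upper bound is likewise an assumption.

```latex
% Assumed notation, not taken from the source:
%   F_s : set of facts about subject s extracted from the data sources
%   len : length measure on sets of facts;  k : story length bound
%   f   : objective function measuring the goodness of a story
\[
  \mathrm{Stories}(s) = \{\, S \subseteq F_s : \mathrm{len}(S) \le k \,\},
  \qquad
  S^{*} \in \operatorname*{arg\,max}_{S \in \mathrm{Stories}(s)} f(S).
\]
```

Under this reading, an exact algorithm searches over all feasible subsets of F_s, while a heuristic returns some feasible S whose value f(S) may fall below f(S*).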
Marat Fayzullin, V. S. Subrahmanian, Massimiliano