We present a probabilistic model for a document corpus that combines many of the desirable features of previous models. The model is called “GaP” for Gamma-Poisson, the distri...
Dividing web pages into fragments has been shown to provide significant benefits for both content generation and caching. In order for a web site to use fragment-based content gen...
Lakshmish Ramaswamy, Arun Iyengar, Ling Liu, Fred ...
We present an approach of how to automatically extract an XML document structure from a conceptual data model that describes the content of the document. We use UML class diagrams ...
This paper proposes GBM (gravitation-based model), a physical model for information retrieval inspired by Newton’s theory of gravitation. A mapping is built in this model from c...
Many documents on the Web are formated in a weakly structured format. Because of their weak semantic and because of the heterogeneity of their formats, the information conveyed by...