The undeniably fast and continuous growth in the volume of indexed (and indexable) documents on the Web poses a great challenge for search engines. This is true regarding not on...
In this paper we present the Infocious Web search engine [23]. Our goal in creating Infocious is to improve the way people find information on the Web by resolving ambiguities pre...
The success of the Semantic Web crucially depends on the easy creation, integration and use of semantic data. For this purpose, we consider an integration scenario that defies cor...
Many applications call for extracting a small set of frequent patterns that have not only high significance but also low redundancy. The significance...
We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...