We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...
—With the vast increase in collection and storage of data, the problem of data summarization is most critical for effective data management. Since much of this data is categorica...
Wikipedia is one of the most popular information sources on the Web. The free encyclopedia is densely linked. The link structure in Wikipedia differs from the Web at large: interna...
— Efficient searching is one of the important design issues in peer-to-peer (P2P) networks. Among various searching techniques, semantic-based searching has drawn significant a...
In collaborative indexing systems users generate a big amount of metadata by labelling web-based content. These labels are known as tags and form a shared vocabulary. In order to u...