Sciweavers

187 search results - page 36 / 38
» Entity categorization over large document collections
KDD
2004
ACM
210 views, Data Mining
14 years 8 months ago
Probabilistic author-topic models for information discovery
We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...
BMCBI
2005
73 views
13 years 7 months ago
An analysis of extensible modelling for functional genomics data
Background: Several data formats have been developed for large scale biological experiments, using a variety of methodologies. Most data formats contain a mechanism for allowing e...
Andrew R. Jones, Norman W. Paton
VLDB
1999
ACM
118 views, Database
13 years 11 months ago
Similarity Search in High Dimensions via Hashing
The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing ...
Aristides Gionis, Piotr Indyk, Rajeev Motwani
IMC
2004
ACM
14 years 1 month ago
An analysis of live streaming workloads on the internet
In this paper, we study the live streaming workload from a large content delivery network. Our data, collected over a 3 month period, contains over 70 million requests for 5,000 d...
Kunwadee Sripanidkulchai, Bruce M. Maggs, Hui Zhan...
WWW
2007
ACM
14 years 8 months ago
Query-driven indexing for peer-to-peer text retrieval
We describe a query-driven indexing framework for scalable text retrieval over structured P2P networks. To cope with the bandwidth consumption problem that has been identified as ...
Gleb Skobeltsyn, Toan Luu, Karl Aberer, Martin Raj...