We show how to build a hierarchical, reduced-rank representation for large stochastic matrices and use this representation to design an efficient algorithm for computing the largest...
A fundamental problem in data management is to draw a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With large, streamin...
Graham Cormode, S. Muthukrishnan, Ke Yi, Qin Zhang
Estimation via sampling out of highly selective join queries is well known to be problematic, most notably in online aggregation. Without goal-directed sampling strategies, samples...
Background: Approximately 5% of Pfam families are enzymatic, but only a small fraction of the sequences within these families (<0.5%) have had the residues responsible for cata...
The lack of standards for Romanization of Thai proper names makes searching a challenging task. This is particularly important when searching for people-related documents ...