The family of threshold algorithm (i.e., TA) has been widely studied for efficiently computing top-k queries. TA uses a sort-merge framework that assumes data lists are pre-sorted...
Similarity search and similarity join on strings are important for applications such as duplicate detection, error detection, data cleansing, or comparison of biological sequences....
Sketching techniques can provide approximate answers to aggregate queries either for data-streaming or distributed computation. Small space summaries that have linearity propertie...
A novel algorithm is proposed to learn pattern similarities for texture image retrieval. Similar patterns in di erent texture classes are grouped into a cluster in the feature spac...
Frequent-pattern mining has been studied extensively on scalable methods for mining various kinds of patterns including itemsets, sequences, and graphs. However, the bottleneck of...