Sampling is a widely used technique to increase efficiency in database and data mining applications operating on large datasets. In this paper we present a scalable sampling imple...
Spatial databases support a variety of geometric queries on point data such as range searches, nearest neighbor searches, etc. Balanced Aspect Ratio (BAR) trees are hierarchical sp...
Identifying approximately duplicate objects in databases is an essential step in the information integration process. Most existing approaches have relied on gener...
For a book, the title and abstract provide a good first impression of what to expect from it. For a database, getting a first impression is not so straightforward. Whil...
Background: Profile hidden Markov models (profile-HMMs) are sensitive tools for remote protein homology detection, but the main scoring algorithms, Viterbi or Forward, require con...