Low-Complexity Regions (LCRs) of biological sequences are the main source of false positives in similarity searches for biological sequence databases. We consider the problem of ...
This paper extends previous work on document retrieval and document type classification, addressing the problem of ‘typed search’. Specifically, given a query and a designated ...
Jun Xu, Yunbo Cao, Hang Li, Nick Craswell, Yalou H...
When searching databases of nucleotide or protein sequences, finding a local alignment of two sequences is one of the main tasks. Since the sizes of available databases grow const...
All pairs similarity search is the problem of finding all pairs of records that have a similarity score above the specified threshold. Many real-world systems like search engine...
Background: The fingerprint of a molecule is a bitstring based on its structure, constructed such that structurally similar molecules will have similar fingerprints. Molecular fin...
Thomas G. Kristensen, Jesper Nielsen, Christian N....