This paper describes a method of extracting katakana words and phrases, along with their English counterparts, from non-aligned monolingual web search engine query logs. The method...
Clustering separates unrelated documents and groups related documents, and is useful for discrimination, disambiguation, summarization, organization, and navigation of unstructure...
We describe an algorithm for finding approximate seeds for DNA homology searches. In contrast to previous algorithms that use exact or spaced seeds, our approximate seeds may conta...
Observing that current Global Similarity Measures (GSM), which average the effect of a few significant differences over all dimensions, may limit performance, we prop...
Zi Huang, Heng Tao Shen, Dawei Song, Xue Li, Stefa...
Using a set of model landscapes, we examine how different mutation rates affect different search metrics. We show that very universal heuristics, such as 1/N and the error threshol...