The k-means algorithm is widely used for clustering, compressing, and summarizing vector data. In this paper, we propose a new acceleration for exact k-means that gives the same a...
Associative Language Descriptions are a recent grammar model, theoretically less powerful than Context Free grammars, but adequate for describing the syntax of programming languag...
Abstract-- Biological sequences from different species are called orthologs if they evolved from a sequence of a common ancestor species and they have the same biological function....
Duplicate URLs have brought serious troubles to the whole pipeline of a search engine, from crawling, indexing, to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
Data mining algorithms use various Trie and bitmap-based representations to optimize the support (i.e., frequency) counting performance. In this paper, we compare the memory requi...