Abstract. Clustering algorithms for multidimensional numerical data must overcome special difficulties due to the irregularities of data distribution. We present a clustering algo...
This paper discusses the topic of dimensionality reduction for k-means clustering. We prove that any set of n points in d dimensions (rows in a matrix A ∈ R^{n×d}) can be project...
The method of stable random projections is an efficient tool for computing the l_α distances using low memory, where 0 < α ≤ 2 may be viewed as a tuning parameter. This method boil...
The motif discovery problem abstracts the task of discovering short, conserved sites in genomic DNA. Pevzner and Sze recently described a precise combinatorial formulation of motif di...
This paper describes a research effort to improve the use of the cosine similarity information retrieval technique to detect unknown or known rogue software, or variants of known rogue software, by...