IPPS
2002
IEEE

Parallel EST Clustering

Expressed sequence tags (ESTs) are DNA fragments experimentally derived from expressed portions of genes. Clustering ESTs is essential for gene recognition and for understanding important genetic variations, such as those that result in disease. In this paper, we present the design and development of a parallel software system for EST clustering. The novel features of our approach include 1) space-efficient algorithms that keep the space requirement linear in the size of the input data set, 2) a combination of algorithmic techniques that reduces the total work without sacrificing the quality of EST clustering, and 3) the use of parallel processing to reduce the run time and facilitate the clustering of large data sets. Using a combination of these techniques, we report the clustering of 50,000 maize ESTs in 16 minutes on a 32-processor IBM SP. To our knowledge, this is the first effort to build a parallel software system for EST clustering.
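The abstract does not spell out the clustering procedure, so the following is only an illustrative sketch of what "clustering ESTs" can mean in the simplest case, not the authors' algorithm. It assumes a toy similarity criterion (two sequences belong together if they share at least one exact substring of length k) and merges them with a union-find structure. The function names `find` and `cluster_ests` and the parameter `k` are hypothetical, the code is sequential rather than parallel, and it ignores reverse-complement strands and sequencing errors.

```python
from collections import defaultdict


def find(parent, i):
    """Return the representative of i's cluster, shortening the path as we go."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path halving
        i = parent[i]
    return i


def cluster_ests(ests, k=8):
    """Group sequences that share at least one exact k-mer.

    Assumption for illustration only: a shared exact k-mer is treated as
    sufficient evidence that two ESTs come from the same gene.
    """
    parent = list(range(len(ests)))
    first_seen = {}  # k-mer -> index of the first sequence containing it

    for idx, seq in enumerate(ests):
        for pos in range(len(seq) - k + 1):
            kmer = seq[pos:pos + k]
            if kmer in first_seen:
                a = find(parent, idx)
                b = find(parent, first_seen[kmer])
                if a != b:
                    parent[a] = b  # merge the two clusters
            else:
                first_seen[kmer] = idx

    # Collect sequence indices by cluster representative.
    clusters = defaultdict(list)
    for idx in range(len(ests)):
        clusters[find(parent, idx)].append(idx)
    return sorted(clusters.values())


if __name__ == "__main__":
    toy = ["ACGTACGTAA", "GTACGTAACC", "TTTTTTTTTT", "CCCCGGGGAA"]
    print(cluster_ests(toy, k=6))
```

In this toy input, the first two sequences overlap and end up in one cluster, while the other two remain singletons. A realistic system would replace the exact k-mer test with an alignment-based overlap check.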
Added 15 Jul 2010
Updated 15 Jul 2010
Type Conference
Year 2002
Where IPPS
Authors Anantharaman Kalyanaraman, Srinivas Aluru, Suresh C. Kothari