We present a novel approach to managing redundancy in sequence databanks such as GenBank. We store clusters of near-identical sequences as a representative union-sequence and a se...
Michael Cameron, Yaniv Bernstein, Hugh E. Williams
An emerging class of data-intensive applications involves the geographically dispersed extraction of complex scientific information from very large collections of measured or compu...
William E. Allcock, Joseph Bester, John Bresnahan,...
We present a new approach for developing robust software applications that breaks dependences on the failed parts of an application’s execution to allow the rest of the...
Subspace learning is very important in today's world of information overload. Distinguishing between categories within a subset of a large data repository such as the web and ...
Nandita Tripathi, Michael P. Oakes, Stefan Wermter
The Web is rapidly moving towards a platform for mass collaboration in content production and consumption. Fresh content on a variety of topics, people, and places is being create...
Yih-Farn Robin Chen, Giuseppe Di Fabbrizio, David ...