In this paper, we argue that agglomerative clustering with the vector cosine similarity measure performs poorly for two reasons. First, the nearest neighbors of a document belo...
Background: Within the peer-reviewed literature, associations between two things are not always recognized until commonalities between them become apparent. These commonalities ca...
Background: Population structure analysis is important to genetic association studies and evolutionary investigations. Parametric approaches, e.g. STRUCTURE and L-POP, usually ass...
Perhaps the most common question that a microarray study can ask is, “Between two given biological conditions, which genes exhibit changed expression levels?” Existing methods...
Will Sheffler, Eli Upfal, John Sedivy, William Sta...
We consider the problem of computing information-theoretic functions such as entropy on a data stream, using sublinear space. Our first result deals with a measure we call the "...