Opening up large amounts of loosely structured information for easy access and use is a complex problem. This paper describes two systems that address different aspects of the pro...
A major challenge facing biodiversity informatics is integrating data stored in widely distributed databases. Initial efforts have relied on taxonomic names as the shared identifi...
While the decision tree is an effective representation that has been used in many domains, a tree can often encode a concept inefficiently. This happens when the tree has to repres...
We develop a new component analysis framework, the Noisy-Or Component Analyzer (NOCA), that targets high-dimensional binary data. NOCA is a probabilistic latent variable model tha...
Self-focus is a novel way of understanding a type of bias in community-maintained Web 2.0 graph structures. It goes beyond previous measures of topical coverage bias by encapsulat...