Display ads on the Internet are often sold in bundles of thousands or millions of impressions over a particular time period, typically weeks or months. Ad serving systems that ass...
Record linkage analysis, which matches records referring to the same real world entities from different data sets, is an important task in data integration. Uncertainty often exi...
We present an original approach to identifying sequence variants in a mixed DNA population from sequence trace data. The heart of the method is based on parsimony: given a wildtyp...
After a classifier is trained using a machine learning algorithm and put to use in a real world system, it often faces noise which did not appear in the training data. Particularl...
We propose an online topic model for sequentially analyzing the time evolution of topics in document collections. Topics naturally evolve with multiple timescales. For example, so...