Heterogeneous and dirty data is abundant. It is stored under different, often opaque schemata, it represents identical real-world objects multiple times, causing duplicates, and ...
Alexander Bilke, Jens Bleiholder, Christoph Bö...
This paper presents a general framework for building classifiers that deal with short and sparse text & Web segments by making the most of hidden topics discovered from larges...
We present a method for computing normals and colors from multiple sets of photometric data, that are consistent with each other and an underlying lower resolution mesh. Normals a...
The growth of the web has directly influenced the increase in the availability of relational data. One of the key problems in mining such data is computing the similarity between o...
Pradeep Muthukrishnan, Dragomir R. Radev, Qiaozhu ...
We present a novel classification-based algorithm called GeneClass for learning to predict gene regulatory response. Our approach is motivated by the hypothesis that in simple orga...
Manuel Middendorf, Anshul Kundaje, Chris Wiggins, ...