We introduce ClueMaker, the first language designed specifically for approximate record matching. Clues written in ClueMaker predict whether two records denote the same thing based...
Martin Buechi, Andrew Borthwick, Adam Winkel, Arth...
Principal components analysis (PCA) is one of the most widely used techniques in machine learning and data mining. Minor components analysis (MCA) is less well known, but can also...
Max Welling, Felix V. Agakov, Christopher K. I. Wi...
Software agents "living" and acting in a real world software environment, such as an operating system, a network, or a database system, can carry out many tasks for huma...
In spoken dialogue systems, it is important for a system to know how likely a speech recognition hypothesis is to be correct, so it can reprompt for fresh input, or, in cases wher...
Published experiments on spidering the Web suggest that, given training data in the form of a (relatively small) subgraph of the Web containing a subset of a selected class of tar...