The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
We present algorithms for fast quantile and frequency estimation in large data streams using graphics processor units (GPUs). We exploit the high computational power and memory ba...
Naga K. Govindaraju, Nikunj Raghuvanshi, Dinesh Ma...
Repetitive and ambiguous visual structures in general pose a severe problem in many computer vision applications. Identification of incorrect geometric relations between images s...
Christopher Zach, Manfred Klopschitz, Marc Pollefe...
1 We propose a new framework for the summarization of XML document properties called EXsum (Element-wise XML summarization), which can capture statistical information of all import...
Abstract. This paper is concerned with algorithms for the logical generalisation of probabilistic temporal models from examples. The algorithms combine logic and probabilistic mode...