The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
A relational probability tree (RPT) is a type of decision tree that can be used for probabilistic classification of instances with a relational structure. Each leaf of an RPT cont...
Neighbor search is a fundamental task in machine learning, especially in classification and retrieval. Efficient nearest neighbor search methods have been widely studied, with the...
In large scale online systems like Search, eCommerce, or social network applications, user queries represent an important dimension of activities that can be used to study the imp...
We extend Haskell with regular expression patterns. Regular expression patterns provide means for matching and extracting data which goes well beyond ordinary pattern matching as ...