The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
Most time series data mining algorithms use similarity search as a core subroutine, and thus the time taken for similarity search is the bottleneck for virtually all time series d...
Thanawin Rakthanmanon, Bilson J. L. Campana, Abdul...
The currently booming search engine industry has determined many online organizations to attempt to artificially increase their ranking in order to attract more visitors to their ...
Users frequently modify a previous search query in hope of retrieving better results. These modifications are called query reformulations or query refinements. Existing research h...
This paper describes a hidden Markov model (HMM) based approach to perform search interface segmentation. Automatic processing of an interface is a must to access the invisible co...