We describe an adaptive method for extracting records from web pages. Our algorithm combines a weighted tree matching metric with clustering for obtaining data extraction patterns...
Many interesting Web-based AI problems require the ability to collect, store and process large text datasets. To address this problem, we have developed Slashpack, an integrated t...
Christopher H. Brooks, Monica Agarwal, Jason Endo,...
As the type of content available on the web is becoming increasingly diverse, a particular challenge is to properly determine the types of documents sought by a user, tha...
Shanu Sushmita, Benjamin Piwowarski, Mounia Lalmas
We introduce a method for learning query transformations that improves the ability to retrieve answers to questions from an information retrieval system. During the training stage...
Measuring the information retrieval effectiveness of Web search engines can be expensive if human relevance judgments are required to evaluate search results. Using implicit user ...