Users prefer to navigate subjects from organized topics in an abundance resources than to list pages retrieved from search engines. We propose a framework to cluster frequent items...
The IDEX system is a prototype of an interactive dynamic Information Extraction (IE) system. A user of the system expresses an information request in the form of a topic descripti...
: Dynamic Web data sources – sometimes known collectively as the Deep Web – increase the utility of the Web by providing intuitive access to data repositories anywhere that Web...
Daniel Rocco, James Caverlee, Ling Liu, Terence Cr...
Gradient Boosted Regression Trees (GBRT) are the current state-of-the-art learning paradigm for machine learned websearch ranking — a domain notorious for very large data sets. ...
Stephen Tyree, Kilian Q. Weinberger, Kunal Agrawal...
We consider a network of autonomous peers forming a logically global but physically distributed search engine, where every peer has its own local collection generated by independe...
Josiane Xavier Parreira, Sebastian Michel, Gerhard...