Abstract. The term Deep Web (sometimes also called Hidden Web) refers to the data content that is created dynamically as the result of a specific search on the Web. In this respec...
Increasingly, many data sources appear as online databases, hidden behind query forms, thus forming what is referred to as the deep web. It is desirable to have systems that can pr...
The deep web contains an order of magnitude more information than the surface web, but that information is hidden behind the web forms of a large number of web sites. Metasearch e...
Jeffrey P. Bigham, Ryan S. Kaminsky, Jeffrey Nicho...
Increasingly, biological data is being shared over the deep web. Many biological queries can only be answered by successively searching a number of distinct web-sites. This paper i...
—The key to Deep Web crawling is to submit promising keywords to query form and retrieve Deep Web content efficiently. To select keywords, existing methods make a decision based ...
Abstract. Crawling the deep web often requires the selection of an appropriate set of queries so that they can cover most of the documents in the data source with low cost. This ca...
In this paper, we introduce the concept of a QA-Pagelet to refer to the content region in a dynamic page that contains query matches. We present THOR, a scalable and efficient min...