Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
The Deep Web, i.e., content hidden behind HTML forms, has long been acknowledged as a significant gap in search engine coverage. Since it represents a large portion of the structu...
Jayant Madhavan, David Ko, Lucja Kot, Vignesh Gana...
Understanding goals and preferences behind a user's online activities can greatly help information providers, such as search engine and E-Commerce web sites, to personalize c...
Honghua (Kathy) Dai, Lingzhi Zhao, Zaiqing Nie, Ji...
We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
Many important search tasks require multiple search sessions to complete. Tasks such as travel planning, large purchases, or job searches can span hours, days, or even weeks. Inev...
Eugene Agichtein, Ryen W. White, Susan T. Dumais, ...