Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
Social network systems on the Internet, such as MySpace and LinkedIn, are growing in popularity around the world. The level of such activity is now comparable to that associated with...
Background: Document classification is a widespread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, ...
The spatio-textual spreadsheet is a conventional spreadsheet where spatial attribute values are specified textually. Techniques are presented to automatically find the textually-s...
Hanan Samet, Jagan Sankaranarayanan, Jon Sperling,...
The ranking function used by search engines to order results is learned from labeled training data. Each training point is a (query, URL) pair that is labeled by a human judge who...
Rakesh Agrawal, Alan Halverson, Krishnaram Kenthap...