We explore the use of the landing page content in sponsored search ad selection. Specifically, we compare the use of the ad’s intrinsic content to augmenting the ad with the wh...
Yejin Choi, Marcus Fontoura, Evgeniy Gabrilovich, ...
With the increasing popularity of location-based services, such as tour guides and location-based social networks, we have now accumulated a large amount of location data on the Web. In this pap...
Vincent Wenchen Zheng, Yu Zheng, Xing Xie, Qiang Y...
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Cross-site scripting flaws have now surpassed buffer overflows as the world’s most common publicly-reported security vulnerability. In recent years, browser vendors and resea...
We present the conceptual framework of the Social Honeypot Project for uncovering social spammers who target online communities and initial empirical results from Twitter and MySp...
We consider the problem of deep web source selection and argue that existing source selection methods are inadequate as they are based on local similarity assessment. Specificall...
Service descriptions allow designers to document, understand, and use services, creating new useful and complex services with aggregated business value. Unlike RPC-based services,...
Clustering and retrieval of web pages predominantly rely on analyzing either the content of individual web pages or the link structure between them. Some literature also suggests t...
Result diversity is a topic of great importance as more facets of queries are discovered and users expect to find their desired facets in the first page of the results. However,...
The Web of Data has emerged as a way of exposing structured linked data on the Web. It builds on the central building blocks of the Web (URIs, HTTP) and benefits from its simplic...