Cloaking is a common “bait-and-switch” technique used to hide the true nature of a Web site by delivering blatantly different semantic content to different user segments. It is often used in search engine optimization (SEO) to obtain user traffic illegitimately for scams. In this paper, we measure and characterize the prevalence of cloaking on different search engines, how this behavior changes for targeted versus untargeted advertising and ultimately the response to site cloaking by search engine providers. Using a custom crawler, called Dagger, we track both popular search terms (e.g., as identified by Google, Alexa and Twitter) and targeted keywords (focused on pharmaceutical products) for over five months, identifying when distinct results were provided to crawlers and browsers. We further track the lifetime of cloaked search results as well as the sites they point to, demonstrating that cloakers can expect to maintain their pages in search results for several days on popul...
David Y. Wang, Stefan Savage, Geoffrey M. Voelker