
WWW
2008
ACM


A larger scale study of robots.txt


A website can regulate search engine crawler access to its content using the robots exclusion protocol, with the rules specified in its robots.txt file. These rules let the site allow or disallow part or all of its content to particular crawlers, which can result in a favorable or unfavorable bias towards some of them. A 2007 survey of robots.txt usage on 7,593 sites found some evidence of such biases, and news of the finding led to widespread discussion on the web. In this paper, we report on our survey of about 6 million sites. Our survey tries to correct the shortcomings of the previous survey and shows no significant preference towards any particular search engine.

Categories and Subject Descriptors: H.3.3 [Information Search and Retrieval]: Search Process
General Terms: Experimentation, Measurement
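To make the idea of per-crawler bias concrete, here is a minimal sketch using Python's standard `urllib.robotparser` module. The sample rules, the crawler names, the `example.com` URL, and the helper `crawler_access` are all illustrative assumptions; none of them come from the paper or its survey data, and this is not the authors' measurement method.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that favors one crawler: Googlebot may fetch
# everything except /private/, while every other crawler is shut out.
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def crawler_access(robots_txt, crawlers, url):
    """Return {crawler name: True/False} for whether each may fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {name: parser.can_fetch(name, url) for name in crawlers}


# Crawler names are examples only.
access = crawler_access(
    SAMPLE_ROBOTS_TXT,
    ["Googlebot", "Slurp", "msnbot"],
    "http://example.com/index.html",
)
for name, allowed in access.items():
    print(f"{name}: {'allowed' if allowed else 'disallowed'}")
```

A file like this one would count as favorably biased towards a single crawler. Repeating a check of this kind over many sites and tallying which crawlers are allowed is one rough way to look for the preferences the abstract describes.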

Santanu Kolay


Internet Technology | Particular Search Engine | Robots Exclusion Protocol | Search Engine Crawler | WWW 2008


Post Info

Added	21 Nov 2009
Updated	21 Nov 2009
Type	Conference
Year	2008
Where	WWW
Authors	Santanu Kolay
