Human Performance on Clustering Web Pages: A Preliminary Study

With the increase in information on the World Wide Web, it has become difficult to quickly find desired information without using multiple queries or a topic-specific search engine. One way to help in the search is to group together HTML pages that appear to be related in some way. To better understand this task, we performed an initial study of human clustering of web pages, in the hope that it would provide some insight into the difficulty of automating it. Our results show that subjects did not cluster identically; in fact, on average, any two subjects had little similarity in their web-page clusters. We also found that subjects generally created rather small clusters, and those with access only to URLs created fewer clusters than those with access to the full text of each web page. Generally, the overlap of documents between clusters for any given subject increased when the subject was given the full text, as did the percentage of documents clustered. When analyzing individ...

Sofus A. Macskassy, Arunava Banerjee, Brian D. Davison, Haym Hirsh

Data Mining | KDD 1998 | Topic-specific Search Engine | Web Pages | Web-page Clusters

» Clustering the Chilean Web

» On the Evolution of Clusters of Near-Duplicate Web Pages

» Effectiveness of web page classification on finding list answers

» Cluster Based Personalized Search

» Web page classification on child suitability

» Finding the boundaries of information resources on the web

» Basic issues on the processing of web queries

» Partition-Based Parallel PageRank Algorithm

Post Info

Added	06 Aug 2010
Updated	06 Aug 2010
Type	Conference
Year	1998
Where	KDD
Authors	Sofus A. Macskassy, Arunava Banerjee, Brian D. Davison, Haym Hirsh
