Standardized Evaluation Method for Web Clustering Results

16 years 3 days ago

Finding a set of web pages relevant to a user's information goal is difficult due to the enormous size of the Internet. Search engines are able to find a set of pages that match the user's query, but refining the results is still difficult and time-consuming. Web clustering addresses this problem by presenting the user with clusters of related pages as refinement options. Many clustering algorithms have been developed, and researchers need to be able to compare their effectiveness. Different algorithms produce clusterings with different characteristics: coarse or fine granularity, disjoint or overlapping, flat or hierarchical. The lack of a standardized web clustering evaluation method that can evaluate clusterings with different characteristics has led to incomparable research and results. This paper addresses this gap by introducing a new structure for defining general ideal clusterings and new measurements for evaluating clusterings with different characteristics by comparing t...

Daniel Crabtree, Xiaoying Gao, Peter Andreae
