MatchSim: a novel neighbor-based similarity measure with maximum neighborhood matching

The problem of measuring similarity between web pages arises in many important Web applications, such as search engines and Web directories. In this paper, we propose a novel neighbor-based similarity measure called MatchSim, which uses only the neighborhood structure of web pages. Technically, MatchSim recursively defines similarity between web pages by the average similarity of the maximum matching between their neighbors. Our method extends the traditional methods which simply count the numbers of common and/or different neighbors. It also successfully overcomes a severe counterintuitive loophole in SimRank, due to its strict consistency with the intuitions of similarity. We give the computational complexity of MatchSim iteration. The accuracy of MatchSim is compared with others on two real datasets. The results show that the method performs best in most cases.

Categories and Subject Descriptors: H.3.3 Information Search and Retrieval: Clustering; Information filtering General T...
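The recursive definition in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract only states that similarity is the average similarity of the maximum matching between neighbors, so several details here are assumptions, namely using in-links as the neighborhood, dividing the matching weight by the larger neighborhood size, scoring pages without neighbors as 0, and starting from the identity similarity. The names `max_matching_weight`, `matchsim`, and the toy graph are hypothetical, and the brute-force matching is only practical for very small neighborhoods.

```python
from itertools import permutations


def max_matching_weight(left, right, sim):
    """Weight of a maximum-weight bipartite matching, found by brute force.

    Assumption: neighborhoods are tiny, so trying every assignment is fine.
    """
    if len(left) > len(right):
        left, right = right, left  # sim is symmetric, so swapping is safe
    best = 0.0
    for perm in permutations(right, len(left)):
        total = sum(sim[(u, v)] for u, v in zip(left, perm))
        best = max(best, total)
    return best


def matchsim(in_neighbors, iterations=10):
    """Iterate a MatchSim-style score over all page pairs.

    in_neighbors maps each page to the list of pages linking to it.
    """
    nodes = list(in_neighbors)
    # Assumed starting point: a page is fully similar to itself only.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new = {}
        for a in nodes:
            for b in nodes:
                na, nb = in_neighbors[a], in_neighbors[b]
                if a == b:
                    new[(a, b)] = 1.0
                elif not na or not nb:
                    new[(a, b)] = 0.0  # assumed convention for empty neighborhoods
                else:
                    weight = max_matching_weight(na, nb, sim)
                    # Assumed normalization: the larger neighborhood size.
                    new[(a, b)] = weight / max(len(na), len(nb))
        sim = new
    return sim


# Hypothetical toy graph: p and q share both in-neighbors, r shares one with p.
toy_graph = {"a": [], "b": [], "p": ["a", "b"], "q": ["a", "b"], "r": ["a"]}
scores = matchsim(toy_graph, iterations=5)
print(scores[("p", "q")], scores[("p", "r")])
```

On this toy graph, pages with identical neighborhoods score 1.0, while a page sharing one of two neighbors scores 0.5 under the assumed normalization. A real implementation would replace the brute-force search with a polynomial-time assignment solver.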

Zhenjiang Lin, Michael R. Lyu, Irwin King

Average Similarity | CIKM 2009 | Database | Neighbor-based Similarity Measure | Web Pages |

Post Info

Added	26 May 2010
Updated	26 May 2010
Type	Conference
Year	2009
Where	CIKM
Authors	Zhenjiang Lin, Michael R. Lyu, Irwin King
