Simulation studies are frequently used to evaluate new peer-to-peer searching techniques as well as existing techniques on new applications. Unless these studies are accurate in th...
Background: Biomedical and chemical databases are large and rapidly growing in size. Graphs naturally model such kinds of data. To fully exploit the wealth of information in these...
A semantic caching scheme suitable for wrappers wrapping web sources is presented. Since the web sources have typically weaker querying capabilities than conventional databases, e...
This paper introduces a framework for clarifying and formalizing the duplicate document detection problem. Four distinct models are presented, each with a corresponding algorithm ...
User clicks on a URL in response to a query are extremely useful predictors of the URL's relevance to that query. Exact match click features tend to suffer from severe data s...
Huihsin Tseng, Longbin Chen, Fan Li, Ziming Zhuang...