Sciweavers

308 search results - page 57 / 62
» Syntactic Similarity of Web Documents
DASFAA
2007
IEEE
14 years 2 months ago
An Original Semantics to Keyword Queries for XML Using Structural Patterns
XML is by now the de facto standard for exporting and exchanging data on the web. The need for querying XML data sources whose structure is not fully known to the user and the need...
Dimitri Theodoratos, Xiaoying Wu
CIKM
2004
Springer
14 years 1 month ago
Exploiting hierarchical relationships in conceptual search
As the number of available Web pages grows, users experience increasing difficulty finding documents relevant to their interests. One of the underlying reasons for this is that mo...
Devanand Ravindran, Susan Gauch
SIGIR
2000
ACM
14 years 8 days ago
Evaluating evaluation measure stability
This paper presents a novel way of examining the accuracy of the evaluation measures commonly used in information retrieval experiments. It validates several of the rules-of-thum...
Chris Buckley, Ellen M. Voorhees
ICAIL
2007
ACM
13 years 11 months ago
Essential deduplication functions for transactional databases in law firms
As massive document repositories and knowledge management systems continue to expand, in proprietary environments as well as on the Web, the need for duplicate detection becomes i...
Jack G. Conrad, Edward L. Raymond
KDD
2008
ACM
14 years 8 months ago
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar