Sciweavers

1353 search results for "Text Indexing with Errors" (page 218 of 271)
CN
1998
13 years 8 months ago
The Anatomy of a Large-Scale Hypertextual Web Search Engine
In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the...
Sergey Brin, Lawrence Page
IJDAR
2010
13 years 6 months ago
Locating and parsing bibliographic references in HTML medical articles
The set of references that typically appear toward the end of journal articles is sometimes, though not always, a field in bibliographic (citation) databases. But even if referenc...
Jie Zou, Daniel X. Le, George R. Thoma
ACL
2006
13 years 9 months ago
An Effective Two-Stage Model for Exploiting Non-Local Dependencies in Named Entity Recognition
This paper shows that a simple two-stage approach to handle non-local dependencies in Named Entity Recognition (NER) can outperform existing approaches that handle non-local depen...
Vijay Krishnan, Christopher D. Manning
WWW
2008
ACM
14 years 9 months ago
Improving relevance judgment of web search results with image excerpts
Current web search engines return result pages containing mostly text summary even though the matched web pages may contain informative pictures. A text excerpt (i.e. snippet) is ...
Zhiwei Li, Shuming Shi, Lei Zhang
KDD
2008
ACM
14 years 8 months ago
De-duping URLs via rewrite rules
A large fraction of the URLs on the web contain duplicate (or near-duplicate) content. De-duping URLs is an extremely important problem for search engines, since all the principal...
Anirban Dasgupta, Ravi Kumar, Amit Sasturkar