Sciweavers

2736 search results - page 448 / 548
» Indexing uncertain data
WWW
2010
ACM
16 years 1 month ago
A pattern tree-based approach to learning URL normalization rules
Duplicate URLs have brought serious troubles to the whole pipeline of a search engine, from crawling, indexing, to result serving. URL normalization is to transform duplicate URLs...
Tao Lei, Rui Cai, Jiang-Ming Yang, Yan Ke, Xiaodon...
ICS
2009
Tsinghua U.
16 years 25 days ago
High-performance regular expression scanning on the Cell/B.E. processor
Matching regular expressions (regexps) is a very common workload. For example, tokenization, which consists of recognizing words or keywords in a character stream, appears in ever...
Daniele Paolo Scarpazza, Gregory F. Russell
MM
2009
ACM
16 years 17 days ago
Lightweight web image reranking
Web image search is inspired by text search techniques; it mainly relies on indexing textual data that surround the image file. But retrieval results are often noisy and image pro...
Adrian Popescu, Pierre-Alain Moëllic, Ioannis...
PETRA
2009
ACM
16 years 17 days ago
Towards faster activity search using embedding-based subsequence matching
Event search is the problem of identifying events or activity of interest in a large database storing long sequences of activity. In this paper, our topic is the problem of identi...
Panagiotis Papapetrou, Paul Doliotis, Vassilis Ath...
SIGIR
2009
ACM
16 years 17 days ago
Addressing morphological variation in alphabetic languages
The selection of indexing terms for representing documents is a key decision that limits how effective subsequent retrieval can be. Often stemming algorithms are used to normaliz...
Paul McNamee, Charles K. Nicholas, James Mayfield