Sciweavers

WWW
2005
ACM
15 years 7 days ago
Duplicate detection in click streams
We consider the problem of finding duplicates in data streams. Duplicate detection in data streams is utilized in various applications including fraud detection. We develop a solu...
Ahmed Metwally, Divyakant Agrawal, Amr El Abbadi
WWW
2005
ACM
Object-level ranking: bringing order to Web objects
In contrast with the current Web search methods that essentially do document-level ranking and retrieval, we are exploring a new paradigm to enable Web search at the object level....
Zaiqing Nie, Yuanzhi Zhang, Ji-Rong Wen, Wei-Ying ...
WWW
2005
ACM
On the lack of typical behavior in the global Web traffic network
We offer the first large-scale analysis of Web traffic based on network flow data. Using data collected on the Internet2 network, we constructed a weighted bipartite client-server ...
Mark Meiss, Filippo Menczer, Alessandro Vespignani
WWW
2005
ACM
A uniform approach to accelerated PageRank computation
In this note we consider a simple reformulation of the traditional power iteration algorithm for computing the stationary distribution of a Markov chain. Rather than communicate t...
Frank McSherry
WWW
2005
ACM
Improving understanding of website privacy policies with fine-grained policy anchors
Website privacy policies state the ways that a site will use personally identifiable information (PII) that is collected from fields and forms in web-based transactions. Since these...
Stephen E. Levy, Carl Gutwin
WWW
2005
ACM
Algorithmic detection of semantic similarity
Automatic extraction of semantic information from text and links in Web pages is key to improving the quality of search results. However, the assessment of automatic semantic meas...
Ana Gabriela Maguitman, Filippo Menczer, Heather R...
WWW
2005
ACM
Automatic identification of user goals in Web search
There has been recent interest in studying the "goal" behind a user's Web query, so that this goal can be used to improve the quality of a search engine's re...
Uichin Lee, Zhenyu Liu, Junghoo Cho
WWW
2005
ACM
Three-level caching for efficient query processing in large Web search engines
Large web search engines have to answer thousands of queries per second with interactive response times. Due to the sizes of the data sets involved, often in the range of multiple...
Xiaohui Long, Torsten Suel
WWW
2005
ACM
Building adaptable and reusable XML applications with model transformations
We present an approach in which the semantics of an XML language is defined by means of a transformation from an XML document model (an XML schema) to an application specific mode...
Ivan Kurtev, Klaas van den Berg
WWW
2005
ACM
Opinion observer: analyzing and comparing opinions on the Web
The Web has become an excellent source for gathering consumer opinions. There are now numerous Web sites containing such opinions, e.g., customer reviews of products, forums, disc...
Bing Liu, Minqing Hu, Junsheng Cheng