Sciweavers

52 search results - page 3 / 11
» Finding near-duplicate web pages: a large-scale evaluation o...
Sort
View
TSE
2010
161views more  TSE 2010»
13 years 5 months ago
Finding Bugs in Web Applications Using Dynamic Test Generation and Explicit-State Model Checking
— Web script crashes and malformed dynamically-generated web pages are common errors, and they seriously impact the usability of web applications. Current tools for web-page vali...
Shay Artzi, Adam Kiezun, Julian Dolby, Frank Tip, ...
KDD
2007
ACM
189views Data Mining» more  KDD 2007»
14 years 7 months ago
Corroborate and learn facts from the web
The web contains lots of interesting factual information about entities, such as celebrities, movies or products. This paper describes a robust bootstrapping approach to corrobora...
Shubin Zhao, Jonathan Betz
SIGIR
2008
ACM
13 years 7 months ago
Classifiers without borders: incorporating fielded text from neighboring web pages
Accurate web page classification often depends crucially on information gained from neighboring pages in the local web graph. Prior work has exploited the class labels of nearby p...
Xiaoguang Qi, Brian D. Davison
AIRWEB
2008
Springer
13 years 9 months ago
Identifying web spam with user behavior analysis
Combating Web spam has become one of the top challenges for Web search engines. State-of-the-art spam detection techniques are usually designed for specific known types of Web spa...
Yiqun Liu, Rongwei Cen, Min Zhang, Shaoping Ma, Li...
WWW
2008
ACM
14 years 8 months ago
As we may perceive: finding the boundaries of compound documents on the web
This paper considers the problem of identifying on the Web compound documents (cDocs) ? groups of web pages that in aggregate constitute semantically coherent information entities...
Pavel Dmitriev