Sciweavers

139 search results - page 11 / 28
» An Approach to Identify Duplicated Web Pages
Sort
View
SCFBM
2008
143views more  SCFBM 2008»
13 years 7 months ago
LSID Tester, a tool for testing Life Science Identifier resolution services
Background: Life Science Identifiers (LSIDs) are persistent, globally unique identifiers for biological objects. The decentralised nature of LSIDs makes them attractive for identi...
Roderic D. M. Page
DOCENG
2009
ACM
14 years 2 months ago
Web document text and images extraction using DOM analysis and natural language processing
: © Web Document Text and Images Extraction using DOM Analysis and Natural Language Processing Parag Mulendra Joshi, Sam Liu HP Laboratories HPL-2009-187 Web page text extraction,...
Parag Mulendra Joshi, Sam Liu
HT
2005
ACM
14 years 1 months ago
As we may perceive: inferring logical documents from hypertext
In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...
Pavel Dmitriev, Carl Lagoze, Boris Suchkov
DIMVA
2011
12 years 11 months ago
Escape from Monkey Island: Evading High-Interaction Honeyclients
Abstract. High-interaction honeyclients are the tools of choice to detect malicious web pages that launch drive-by-download attacks. Unfortunately, the approach used by these tools...
Alexandros Kapravelos, Marco Cova, Christopher Kru...
CMS
2010
150views Communications» more  CMS 2010»
13 years 8 months ago
Throwing a MonkeyWrench into Web Attackers Plans
Abstract. Client-based attacks on internet users with malicious web pages represent a serious and rising threat. Internet Browsers with enabled active content technologies such as ...
Armin Büscher, Michael Meier, Ralf Benzmü...