Sciweavers

289 search results - page 11 / 58
» Postal Address Detection from Web Documents
IPPS
2002
IEEE
14 years 26 days ago
On Reliable and Scalable Peer-to-Peer Web Document Sharing
We propose a peer-to-peer Web document sharing technique, called “Browsers-Aware Proxy Server”. In this design, a proxy server connecting to a group of networked clients maint...
Li Xiao, Xiaodong Zhang, Zhichen Xu
DIMVA
2009
13 years 9 months ago
Browser Fingerprinting from Coarse Traffic Summaries: Techniques and Implications
We demonstrate that the browser implementation used at a host can be passively identified with significant precision and recall, using only coarse summaries of web traffic to and f...
Ting-Fang Yen, Xin Huang, Fabian Monrose, Michael ...
DIS
2007
Springer
14 years 2 months ago
Unsupervised Spam Detection Based on String Alienness Measures
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
Kazuyuki Narisawa, Hideo Bannai, Kohei Hatano, Mas...
WEBI
2005
Springer
14 years 1 months ago
Automated Metadata and Instance Extraction from News Web Sites
In this paper, we present automated techniques for extracting metadata instance information by organizing and mining a set of news Web sites. We develop algorithms that detect and...
Srinivas Vadrevu, Saravanakumar Nagarajan, Fatih G...
IJWET
2008
13 years 8 months ago
Warehousing complex data from the web
Data warehousing and Online Analytical Processing (OLAP) technologies are now moving onto handling complex data that mostly originate from the web. However, integrating such data...
Omar Boussaid, Jérôme Darmont, Fadila...