Sciweavers

2189 search results - page 199 / 438
» Webbed documents
SIGIR
2010
ACM
14 years 9 months ago
Efficient partial-duplicate detection based on sequence matching
With the ever-increasing growth of the Internet, numerous copies of documents have become a serious problem for search engines, opinion mining, and many other web applications. Since parti...
Qi Zhang, Yue Zhang, Haomin Yu, Xuanjing Huang
ICDE
2007
IEEE
15 years 6 months ago
A UML Profile for Core Components and their Transformation to XSD
In business-to-business e-commerce, traditional electronic data interchange (EDI) approaches such as UN/EDIFACT have been superseded by approaches like web services and ebXML. Nev...
Christian Huemer, Philipp Liegl
DIS
2007
Springer
15 years 8 months ago
Unsupervised Spam Detection Based on String Alienness Measures
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
Kazuyuki Narisawa, Hideo Bannai, Kohei Hatano, Mas...
SEMWEB
2007
Springer
15 years 8 months ago
HealthFinland - Finnish Health Information on the Semantic Web
This paper shows how semantic web techniques can be applied to solving problems of distributed content creation, discovery, linking, aggregation, and reuse in health information po...
Eero Hyvönen, Kim Viljanen, Osma Suominen
FAST
2009
15 years 10 days ago
The Case for Browser Provenance
In our increasingly networked world, web browsers are important applications. Originally an interface tool for accessing distributed documents, browsers have become ubiquitous, in...
Daniel W. Margo, Margo I. Seltzer