Sciweavers

708 search results - page 23 / 142
» Identifying Content Blocks from Web Documents
WWW
2009
ACM
14 years 9 months ago
User-centric content freshness metrics for search engines
In order to return relevant search results, a search engine must keep its local repository synchronized to the Web, but it is usually impossible to attain perfect freshness. Hence...
Ali Dasdan, Xinh Huynh
SIGMOD
2000
ACM
14 years 29 days ago
Finding Replicated Web Collections
Many web documents (such as JAVA FAQs) are being replicated on the Internet. Often entire document collections (such as hyperlinked Linux manuals) are being replicated many times....
Junghoo Cho, Narayanan Shivakumar, Hector Garcia-M...
ICDCS
2000
IEEE
14 years 1 month ago
On Supporting Weakly-Connected Browsing in a Mobile Web Environment
A mobile environment is weakly-connected, characterized by low communication bandwidth and poor connectivity. Conventional paradigm for surfing mobile web documents is ineffective s...
Antonio Si, Hong Va Leong, Dennis McLeod, Stanley ...
WWW
2009
ACM
14 years 9 months ago
Estimating web site readability using content extraction
Nowadays, information is primarily searched on the WWW. From a user perspective, the readability is an important criterion for measuring the accessibility and thereby the quality ...
Thomas Gottron, Ludger Martin
LISA
2003
13 years 10 months ago
DryDock: A Document Firewall
Auditing a web site’s content is an arduous task. For any given page on a web server, system administrators are often ill-equipped to determine who created the document, why it...
Deepak Giridharagopal