Existing HTML markup indicates only the structure and layout of documents, not their semantics. As a result, web documents are difficult to be semantically p...
One of the current problems in Web Information Retrieval (IR) is identifying redundant information that exists in (replicated) Web documents. These documents can easily be found in...
Replicating Web documents at a worldwide scale can help reduce user-perceived latency and wide-area network traffic. This paper presents the design of Globule, a platform that aut...
The limited display size and resolution of mobile devices is one of the main obstacles to the widespread use of web applications in a wireless environment. Web pages are often too ...
Most text analysis is designed to deal with the concept of a “document”, namely a cohesive presentation of thought on a unifying subject. By contrast, individual nodes on the ...