Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...
We address the problem of querying XML data over a P2P network. In P2P networks, the allowed kinds of queries are usually exact-match queries over file names. We discuss the exte...
With the rapid growth of XML-document traffic on the Internet, scalable content-based dissemination of XML documents to a large, dynamic group of consumers has become an important...
Because a hypermedia document is more complex than conventional text, it requires preparation with respect to two key aspects. First, the author begins to develop a "vision"...
Takeshi Shimizu, Stephen W. Smoliar, John S. Borec...
We present here a method for automatically projecting structural information across translations, including canonical citation structure (such as chapters and sections), speaker i...