This paper reviews the main innovations of XML and considers their impact on the editing techniques for structured documents. Namespaces open the way to compound documents; well-f...
Disambiguating person names in a set of documents (such as a set of web pages returned in response to a person name) is a key task for the presentation of results and the automatic...
Document representations can rapidly become unwieldy if they try to encapsulate all possible document properties, ranging tract structure to detailed rendering and layout. We pres...
—A handwritten Tibetan database, MRG-OHTC, is presented to facilitate the research of online handwritten Tibetan character recognition. The database contains 910 Tibetan characte...
Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques ...