This paper describes a validating XML parsing method based on deterministic finite state automata (DFA). XML parsing and validation is performed by a schema-specific XML parser th...
Document structure analysis can be regarded as a syntactic analysis problem. The order and containment relations among the physical or logical components of a document page can be...
We present Minimum Bayes-Risk (MBR) decoding for statistical machine translation. This statistical approach aims to minimize expected loss of translation errors under loss functio...
We present an approach for learning part-of-speech distinctions by induction over the lexicon of the Cyc knowledge base. This produces good results (74.6%) using a decision tree t...