Natural language processing (NLP) programs are confronted with various difficulties in processing HTML and XML documents, and have the potential to produce better results if linguistic information is annotated in the source texts. We have therefore developed the Linguistic Annotation Language (LAL), an XML-compliant tag set for assisting natural language processing programs, together with NLP tools, such as parsers and machine translation programs, that accept LAL-annotated input. In addition, we have developed a LAL annotation editor that allows users to annotate documents graphically without seeing tags. Further, we have conducted an experiment to measure how much LAL annotation improves translation quality.
Hideo Watanabe, Katashi Nagao, Michael C. McCord,