This paper presents a method of automatically creating hypermedia documents from conventional transcriptions of television programs. Using parallel text alignment techniques, the ...
We demonstrate an approach for inducing a tagger for historical languages based on existing resources for their modern varieties. Tags from Present Day English source text are pro...
This paper describes a language independent method for alignment of parallel texts that makes use of homograph tokens for each pair of languages. In order to filter out tokens tha...
Abundant Chinese paraphrasing resource on Internet can be attained from different Chinese translations of one foreign masterpiece. Paraphrases corpus is the corpus that includes s...
In this paper, we will present an efficient method to compute the co-occurrence counts of any pair of substring in a parallel corpus, and an algorithm that make use of these count...