This paper snggests a method to align Korean-English parallel corpus. '1?he structural dissimilarity between Korean and Indo-European languages requires more flexible measures to evaluate the alignment candidates between the bilingual units than is used to handle the pairs of Indo-European languages. The flexible measure is intended to capture the dependency between bilingual items that can occur in different units according to different ordering rules. The proposed method to accomplish KoreanEnglish aligmnent takes phrases as an alignment unit that is a departure from the existing methods taking words as the unit. Phrasal alignment avoids the problem of alignment units and appease the problem of ordering mismatch. The parameters are estimated using the EM algorithm. The proposed alignment algorithm is based on dynamic programming. In the experimenl,s carried out on 253,000 English words and its Korean translations the proposed method achived 68.7% in accuracy at phrase level and...
Jung H. Shin, Young S. Han, Key-Sun Choi