"Word" is difficult to define in the languages that do not exhibit explicit word boundary, such as Thai. Traditional methods on defining words for this kind of languages...
Abstract. Machine learning approaches in natural language processing often require a large annotated corpus. We present a complementary approach that utilizes expert knowledge to o...
Patent documents contain important research results. However, they are lengthy and rich in technical terminology such that it takes a lot of human efforts for analyses. Automatic...
Abstract. Several methods were proposed to reduce the number of instances (vectors) in the learning set. Some of them extract only bad vectors while others try to remove as many in...