Abstract. This paper describes and compares the use of methods based on n-grams (specifically trigrams and pentagrams), together with five features, to recognise the syntactic and s...
Query segmentation is essential to query processing. It aims to tokenize query words into several semantic segments and help the search engine to improve the precision of retrieva...
Chao Zhang, Nan Sun, Xia Hu, Tingzhu Huang, Tat-Se...
This paper presents a character segmentation algorithm for unconstrained cursive handwritten text. The transformation-based learning method and a simplified variation of it are us...
A patent always contains some images along with the text. Many text-based systems have been developed to search the patent database. In this paper, we describe PATSEEK, which is an ...
This paper presents a method of using mutual information to improve the recognition algorithm for unknown Chinese words. It can resolve the complexity of weight settings and the in...