This paper describes a method of detecting Japanese Katakana variants from a large corpus. Katakana words, which are mainly used as loanwords, cause problems with information retr...
Recent models of natural language processing employ statistical reasoning for dealing with the ambiguity of formal grammars. In this approach, statistics, concerning the various li...
This paper is an exploration in a functional programming framework of isomorphisms between elementary data types (natural numbers, sets, finite functions, permutations binary deci...
Abstract. Recently, some non-regular subclasses of context-free grammars have been found to be efficiently learnable from positive data. In order to use these efficient algorithms ...
We develop a generic method for the review matching problem, which is to match unstructured text reviews to a list of objects, where each object has a set of attributes. To this e...
Nilesh N. Dalvi, Ravi Kumar, Bo Pang, Andrew Tomki...