In this paper we describe the current state of a new Japanese lexical resource: the Hinoki treebank. The treebank is built from dictionary definitions, examples and news text, and ...
Detecting idioms in a sentence is important to sentence understanding. This paper discusses the linguistic knowledge for idiom detection. The challenges are that idioms can be ambi...
Only very recently have Vietnamese researchers become involved in the domain of Natural Language Processing (NLP). As no published work exists in formal linguis...
A lack of surveillance system infrastructure in the Asia-Pacific region is seen as hindering the global control of rapidly spreading infectious diseases such as the recent avian H5...
Nigel Collier, Ai Kawazoe, Lihua Jin, Mika Shigema...
When building a new spoken dialogue application, large amounts of domain-specific data are required. This paper addresses the issue of generating in-domain training data when litt...
In a 12-month project we have developed a new, register-diverse, 55-million-word bilingual corpus, the New Corpus for Ireland (NCI), to support the creation of a new English-to-Iri...
Web searchers reformulate their queries, as they adapt to search engine behavior, learn more about a topic, or simply correct typing errors. Automatic query rewriting can help user...
Rosie Jones, Kevin Bartz, Pero Subasic, Benjamin R...