Research on information extraction (IE) seeks to distill relational tuples from natural language text, such as the contents of the WWW. Most IE work has focussed on identifying st...
There is a wealth of information to be mined from narrative text on the World Wide Web. Unfortunately, standard natural language processing (NLP) extraction techniques expect full, grammat...
The lack of parallel corpora and linguistic resources for many languages and domains is one of the major obstacles for the further advancement of automated translation. A possible...
Marcis Pinnis, Radu Ion, Dan Stefanescu, Fangzhong...
One aspect in which retrieving named entities is different from retrieving documents is that the items to be retrieved – persons, locations, organizations – are only indirect...
Wikipedia provides an interesting amount of text for more than a hundred languages. This also includes languages where no reference corpora or other linguistic resources are easily ...