Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use...
Abstract. Newistic is a web mining platform that collects and analyses documents crawled from the Internet. Although it currently processes news articles, it can be easily adapted ...
Named entity recognition aims at extracting named entities from unstructured text. A recent trend of named entity recognition is finding approximate matches in the text with respe...
Wei Wang 0011, Chuan Xiao, Xuemin Lin, Chengqi Zha...
This paper describes a language-independent, scalable system for both challenges of crossdocument co-reference: name variation and entity disambiguation. We provide system results...
We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. In particular, we tackle ...