We present a model that improves entity-entity link modeling in a mixed membership stochastic block model by jointly modeling links with text about the entities that are linked i...
The purpose of this paper is to investigate Enterprise Resource Planning (ERP) system outcomes in the context of small and medium-sized enterprises (SMEs). Most of the former rese...
This paper offers a novel look at using a dimensionality reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Machine learning often relies on costly labeled data, and this impedes its application to new classification and information extraction problems. This has motivated the developme...
Abstract. This paper presents a language-independent Multilingual Document Clustering (MDC) approach on comparable corpora. Named entities (NEs) such as persons, locations, organiza...