We give a survey over the INEX initiative, which focuses on the evaluation of content -based access to XML documents. First, we describe the test setting and the various tracks of...
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as underlying process of generating corpus and utilize a novel,...
Most text mining methods are based on representing documents using a vector space model, commonly known as a bag of word model, where each document is modeled as a linear vector r...
Rowena Chau, Ah Chung Tsoi, Markus Hagenbuchner, V...
Theme network is a semantic network of document specific themes. So far Natural Language Processing (NLP) research patronized much of topic based summarizer system, unable to captu...
Reciprocity is a pervasive concept that plays an important role in governing people’s behavior, judgments, and thus their social interactions. In this paper we present an analys...