This paper investigates the use of supervised clustering in order to create sets of categories for classi cation of documents. We use information from a pre-existing taxonomy in o...
The present study concentrates on the relation between sentence length (SL) and word length (WL) as a possible factor in text classification. The dependence of WL and SL is discuss...
Topic models have been studied extensively in the context of monolingual corpora. Though there are some attempts to mine topical structure from cross-lingual corpora, they require ...
Abstract. The eigenspectrum of a graph Laplacian encodes smoothness information over the graph. A natural approach to learning involves transforming the spectrum of a graph Laplaci...
To increase the relevancy of local patterns discovered from noisy relations, it makes sense to formalize error-tolerance. Our starting point is to address the limitations of state...