We are proposing a simple, but efficient basic approach for a number of multilingual and cross-lingual language technology applications that are not limited to the usual two or th...
We propose a multi-document generic summarization model based on the budgeted median problem. Our model selects sentences to generate a summary so that every sentence in the docum...
How can it be said that texts are "near" or "distant" from one another? Are different texts by a single author more similar than texts by different authors? To...
Abstract. We describe a semantic clustering method designed to address shortcomings in the common bag-of-words document representation for functional semantic classification tasks....