At the International Research and Educational Institute for Integrated Medical Sciences (IREIIMS) project, we are collecting complete medical data sets to determine relationships b...
We present an annotation project for two subsets of the Enron email corpus. The first is a subset of the UC Berkeley Enron Email Analysis Project and the second consists of a port...
Jade Goldstein, Andres Kwasinksi, Paul Kingsbury, ...
Document clustering is a very hard task in Automatic Text Processing since it requires to extract regular patterns from a document collection without a priori knowledge on the cat...
Effective retrieval of court decisions is important. Automatically identifying legal concepts in the decision texts would be very helpful. In this paper we investigate how a stat...
We propose a semi-supervised model which segments and annotates images using very few labeled images and a large unaligned text corpus to relate image regions to text labels. Give...