Effective retrieval of court decisions is important. Automatically identifying legal concepts in the decision texts would be very helpful. In this paper we investigate how a stat...
We present a system for searching and classifying U.S. patent documents, based on Inquery. Patents are distributed through hundreds of collections, divided up by general area. The...
We present an annotation project for two subsets of the Enron email corpus. The first is a subset of the UC Berkeley Enron Email Analysis Project and the second consists of a port...
Jade Goldstein, Andres Kwasinksi, Paul Kingsbury, ...
As a side effect of e-marketing strategy the number of spam e-mails is rocketing, the time and cost needed to deal with spam as well. Spam filtering is one of the most difficult t...
We introduce a new type of Self-Organizing Map (SOM) to navigate in the Semantic Space of large text collections. We propose a "hyperbolic SOM" (HSOM) based on a regular...