Much information that could help solve and prevent crimes is never gathered because the reporting methods available to citizens and law enforcement personnel are not optimal. Dete...
We introduce a new set of tools for working with web-scale N-gram data. These tools lower the barrier for working with web-scale text, and create a new platform for acquiring larg...
Dekang Lin, Kenneth Ward Church, Heng Ji, Satoshi ...
Much information over the Internet is expressed by natural languages. The management of linguistic information involves an operation of comparison and aggregation. In this paper, ...
Documents can be assigned keywords by frequency analysis of the terms found in the document text, which arguably is the primary source of knowledge about the document itself. By in...
Anette Hulth, Jussi Karlgren, Anna Jonsson, Henrik...