Automatic Term Recognition (ATR) is a fundamental processing step preceding more complex tasks such as semantic search and ontology learning. From a large number of methodologies ...
With the development of variable-data-driven digital presses, where each document printed is potentially unique, there is a need for pre-press optimization to identify material that...
Alexander J. Macdonald, David F. Brailsford, John ...
This paper presents an empirical study for improving the performance of text chunking. We focus on two issues: the problem of selecting feature spaces, and the problem of alleviat...
Large-scale information processing applications must rapidly search through high-volume streams of structured and unstructured textual data to locate useful information. Content-ba...
This paper presents a lightweight method for unsupervised extraction of paraphrases from arbitrary textual Web documents. The method differs from previous approaches to paraphrase...