The ADROIT system that we are developing allows automatic discourse analysis of information rich natural language texts extracted directly from the web. We use guidelines and rela...
One may need to build a statistical parser for a new language, using only a very small labeled treebank together with raw text. We argue that bootstrapping a parser is most promis...
Research on the discovery of terms from corpora has focused on word sequences whose recurrent occurrence in a corpus is indicative of their terminological status, and has not addr...
Most works related to unithood were conducted as part of a larger effort for the determination of termhood. Consequently, the number of independent research that study the notion ...
We present a novel sentence reduction system for automatically removing extraneous phrases from sentences that are extracted from a document for summarization purpose. The system ...