We report on our work to automatically build a corpus of instructional text annotated with lexical semantics information. We have coupled the parser LCFLEX with a lexicon and onto...
We report here empirical results of a series of studies aimed at automatically predicting information quality in news documents. Multiple research methods and data analysis techni...
Rong Tang, Kwong Bor Ng, Tomek Strzalkowski, Paul ...
This paper investigates bootstrapping for statistical parsers to reduce their reliance on manually annotated training data. We consider both a mostly-unsupervised approach, co-tra...
Mark Steedman, Rebecca Hwa, Stephen Clark, Miles O...
We introduce two probabilistic models that can be used to identify elementary discourse units and build sentence-level discourse parse trees. The models use syntactic and lexical ...
Leximancer is a software system for performing conceptual analysis of text data in a largely language-independent manner. The system is modelled on Content Analysis and provides u...
Automatic restoration of punctuation from unpunctuated text has application in improving the fluency and applicability of speech recognition systems. We explore the possibility t...
Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling...
Large-scale parsing is still a complex and time-consuming process, often so much that it is infeasible in real-world applications. The parsing system described here addresses this ...
We present CarmelTC, a novel hybrid text classification approach for automatic essay grading. Our evaluation demonstrates that the hybrid CarmelTC approach outperforms two “bag...