The recent availability of large collections of text such as the Google 1T 5-gram corpus (Brants and Franz, 2006) and the Gigaword corpus of newswire (Graff, 2003) have made it po...
We present a framework for active learning in the multiple-instance (MI) setting. In an MI learning problem, instances are naturally organized into bags and it is the bags, instea...
This paper investigates the usefulness of sentence-internal prosodic cues in syntactic parsing of transcribed speech. Intuitively, prosodic cues would seem to provide much the sam...
Michelle L. Gregory, Mark Johnson, Eugene Charniak
Information extraction can be defined as the task of automatically extracting instances of specified classes or relations from text. We consider the case of using machine learni...
This paper describes a new program, correct, which takes words rejected by the Unix spell program, proposes a list of candidate corrections, and sorts them by probability. The pro...
Mark D. Kernighan, Kenneth Ward Church, William A....