Many modern natural language processing applications use search engines to locate large numbers of Web documents or to compute statistics over the Web corpus. Yet Web search engines are designed and optimized for simple human queries and are not well suited to supporting such applications. As a result, these applications are forced to issue millions of successive queries, resulting in unnecessary search engine load and in slow applications with limited scalability. In response, this paper introduces the Bindings Engine (BE), which supports queries containing typed variables and string-processing functions. For example, in response to the query "powerful <noun>", BE will return all the nouns in its index that immediately follow the word "powerful", sorted by frequency. In response to the query "Cities such as ProperNoun(Head(<NounPhrase>))", BE will return a list of proper nouns likely to be city names. BE's novel neighborhood index enables it to do s...
Michael J. Cafarella, Oren Etzioni
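
The typed-variable query in the abstract (a literal word followed by a noun slot, with bindings ranked by frequency) can be illustrated with a minimal sketch. This is a naive linear scan over a tiny hand-tagged corpus, not the neighborhood index the paper introduces; the corpus, the tag names, and the function `bindings_after` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical hand-tagged toy corpus: (token, part-of-speech) pairs.
# A real system would obtain tags from a tagger run over indexed documents.
TAGGED_CORPUS = [
    ("a", "det"), ("powerful", "adj"), ("engine", "noun"),
    ("and", "conj"), ("a", "det"), ("powerful", "adj"), ("idea", "noun"),
    ("from", "prep"), ("a", "det"), ("powerful", "adj"), ("engine", "noun"),
    ("with", "prep"), ("powerful", "adj"), ("new", "adj"), ("tools", "noun"),
]


def bindings_after(corpus, literal, var_type):
    """Return (binding, count) pairs for tokens tagged var_type that
    immediately follow the word `literal`, most frequent first."""
    counts = Counter()
    # Pair each token with its immediate successor.
    for (tok, _), (next_tok, next_tag) in zip(corpus, corpus[1:]):
        if tok == literal and next_tag == var_type:
            counts[next_tok] += 1
    return counts.most_common()


# Nouns that immediately follow "powerful", ranked by frequency.
result = bindings_after(TAGGED_CORPUS, "powerful", "noun")
print(result)  # [('engine', 2), ('idea', 1)]
```

Note that "tools" is not returned, because it does not immediately follow "powerful". The scan above touches every token for every query, which is the kind of cost an index-based approach is meant to avoid.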