Information Retrieval (IR) is a major component of many of our daily activities, and its most prominent role is perhaps in search engines. Today’s most advanced engines use the keyword-based (“bag of words”) paradigm, which has some inherent disadvantages. We believe that natural language (NL) is a more user-oriented, context-preserving and intuitive mechanism for web search. In this paper, we explore shallow NLP techniques to support a range of NL queries over an existing keyword-based engine. We present JASE, a web application that wraps the Google search engine and performs web searches by decomposing input NL queries and generating new queries that are better suited to the search engine. By using some of Google’s syntactic operators and filters, it creates “clever” queries to improve precision. A preliminary evaluation was conducted to test JASE’s accuracy, and the results are encouraging. We conclude that the NL model has potential to not only...
Alex Penev, Raymond K. Wong