In TREC 2004, the Database and Information System Lab (DBIS) at University of Illinois at Chicago (UIC) participates in the robust track, which is a traditional ad hoc retrieval task. The emphasis is based on average effectiveness as well as individual topic effectiveness. In our system, noun phrases in the query are identified and classified into 4 types: proper names, dictionary phrases, simple phrases and complex phrases. A document has a phrase if all content words in a phrase are within a window of a certain size. The window sizes for different types of phrases are different. We consider phrases to be more important than individual terms. As a consequence, documents in response to a query are ranked with matching phrases given a higher priority. WordNet is used to disambiguate word senses. Whenever the sense of a query term is determined, its synonyms, hyponyms, words from its definition and its compound concepts are considered for possible additions to the query. The newly added...
Shuang Liu, Chaojing Sun, Clement T. Yu