The most costly aspect of gathering information over the Internet is that of transferring data over the network to answer the user’s query. We make two contributions in this paper that alleviate this problem. First, we present an algorithm for reducing the number of information sources in an information gathering (IG) plan by reasoning with localized closed world (LCW) statements. In contrast to previous work on this problem, our algorithm can handle recursive information gathering plans that arise commonly in practice. Second, we present a method for reducing the amount of network traffic generated while executing an information gathering plan by reordering the sequence in which queries are sent to remote information sources. We will explain why a direct application of traditional distributed database methods to this problem does not work, and present a novel and cheap way of adorning source descriptions to assist in ordering the queries.
Marc Friedman, Daniel S. Weld