The World-Wide-Web is less agent-friendly than we might hope. Most information on the Web is presented in loosely structured natural language text with no agent-readable semantics. HTML annotations structure the display of Web pages, but provide virtually no insight into their content. Thus, the designers of intelligent Web agents need to address the following questions: (1) To what extent can an agent understand information published at Web sites? (2) Is the agent's understanding sufficient to provide genuinely useful assistance to users? (3) Is site-specific hand-coding necessary, or can the agent automatically extract information from unfamiliar Web sites? (4) What aspects of the Web facilitate this competence? In this paper we investigate these issues with a case study using ShopBot, a fully-implemented, domain-independent comparison-shopping agent. Given the home pages of several online stores, ShopBot autonomously learns how to shop at those vendors. After learning, it is able to speed...
Robert B. Doorenbos, Oren Etzioni, Daniel S. Weld