The World Wide Web is a huge distributed database. However, the information in it is free-format and unorganized, so traditional keyword-based retrieval approaches are no longer adequate. In this paper, we consider a framework for constructing agents that simulate how a human browses the Internet. Given a specific target, such an agent uses existing search engines to navigate the web, locate the sites containing the target information, and extract that information into a database. We refer to agents of this type as Personal Navigating Agents (PNAs). Since the information service is domain specific, in this paper we first focus on PNAs that retrieve information about people from the web. In our experiment, given the name of a university, we extract the following information about its faculty: name, telephone number, fax number, email address, and URL. We explore web page knowledge in two ways: First, we deve...
H. L. Wang, W. K. Shih, C. N. Hsu, Y. S. Chen, Y.