This paper proposes a method for creating a high-quality collection of researchers' homepages. The method consists of three phases: rough filtering of candidate web pages, accurate evaluation of the remaining pages, and precise selection of the correct homepages. For the rough filtering, the authors first define content-based keyword lists, then generate filtering rules from them and relax the rules with heuristics. For the evaluation and the selection, they use a support vector machine with feature sets derived from the content words of the web pages, and propose an approach that exploits web-specific properties to improve the measures.

Keywords: Web Mining, Web Information Retrieval, Machine Learning, Web Page Classification
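The rough-filtering phase can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword lists, the rule "at least one keyword from every list", and the `min_lists` relaxation parameter are all assumptions made for illustration, since the abstract does not give the actual lists, rules, or heuristics.

```python
import re

# Hypothetical content-based keyword lists (assumed; the paper's lists are not given here).
KEYWORD_LISTS = {
    "role": {"professor", "researcher", "lecturer", "phd"},
    "activity": {"publications", "research", "projects", "teaching"},
    "contact": {"email", "office", "phone", "address"},
}


def matched_lists(text):
    """Return the names of the keyword lists that have at least one keyword in text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {name for name, keywords in KEYWORD_LISTS.items() if words & keywords}


def passes_filter(text, min_lists=len(KEYWORD_LISTS)):
    """Assumed rule: strict by default (every list must match).

    Lowering min_lists relaxes the rule so that fewer lists need to match.
    """
    return len(matched_lists(text)) >= min_lists


# Toy page texts, invented for this example.
page_a = "Jane Doe, Professor. Research and publications. Office 101, email below."
page_b = "Research projects of our lab"
page_c = "Dr. Smith is a researcher. Selected publications follow."

for name, page in [("a", page_a), ("b", page_b), ("c", page_c)]:
    print(name, passes_filter(page), passes_filter(page, min_lists=2))
```

In this sketch the strict rule keeps only `page_a`, while the relaxed rule (`min_lists=2`) also keeps `page_c`. Pages that survive the filter would then go on to the evaluation and selection phases with the support vector machine.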