500,000 PubMed abstracts. However, fewer than 50 documents are relevant for most queries, so scoring all 500,000 abstracts would introduce a great deal of noise. In the first step, we therefore narrowed the document set with a simple keyword search. For the second step, we developed two methods. The first (Method 1) uses a heuristic scoring system that simply counts occurrences of verbs and their derived words that are important for specifying the function of a query gene or its product. The second (Method 2) uses a machine learning technique to score documents.
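
The two-step procedure with Method 1 can be illustrated with a minimal sketch. It is not the implementation used in this work. The verb stems in `FUNCTION_VERB_STEMS`, the prefix-based matching of derived words, and the function names `keyword_filter`, `heuristic_score`, and `rank` are all assumptions made for illustration, since the actual verb list and matching rules are not given here.

```python
import re

# Hypothetical stems of function-describing verbs (an assumption; the
# actual verb list is not specified in the text).
FUNCTION_VERB_STEMS = ["activat", "inhibit", "regulat", "bind", "phosphorylat"]

# A stem followed by any word characters also matches derived words,
# e.g. "activat" matches "activates", "activated", and "activation".
_STEM_PATTERN = re.compile(
    r"\b(?:" + "|".join(FUNCTION_VERB_STEMS) + r")\w*", re.IGNORECASE
)


def keyword_filter(abstracts, gene):
    """Step 1: keep only the abstracts that mention the query gene name."""
    pattern = re.compile(r"\b" + re.escape(gene) + r"\b", re.IGNORECASE)
    return [a for a in abstracts if pattern.search(a)]


def heuristic_score(abstract):
    """Step 2, in the spirit of Method 1: count verbs and derived words."""
    return len(_STEM_PATTERN.findall(abstract))


def rank(abstracts, gene):
    """Filter by keyword, then sort the survivors by descending score."""
    candidates = keyword_filter(abstracts, gene)
    scored = [(heuristic_score(a), a) for a in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Filtering first means the scoring function only runs on the small candidate set rather than on the full collection, which is the motivation given above for the keyword search step.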