Background: Synthesis of data from published human genetic association studies is a critical step in the translation of human genome discoveries into health applications. Although genetic association studies account for a substantial proportion of the abstracts in PubMed, identifying them with standard queries is not always accurate or efficient. Further automating the literature-screening process can reduce the burden of a labor-intensive and time-consuming traditional literature search. The Support Vector Machine (SVM), a well-established machine learning technique, has been successful in classifying text, including biomedical literature. GAPscreener, an SVM-based software tool, can be used to assist in screening PubMed abstracts for human genetic association studies. Results: The data source for this research was the HuGE Navigator, formerly known as the HuGE Pub Lit database. Weighted SVM feature selection based on a keyword list obtained by the two-way z score method demonstrated the best ...
Wei Yu, Melinda Clyne, Siobhan M. Dolan, Ajay Yesu