Open Source Software is computer software whose source code is publicly available for inspection, modification, and redistribution. While research on a few large, successful projects has provided insights into the nature and practices of the open source software community, it leaves open questions about the thousands of other open source projects that are neither large nor highly successful. In this paper, we describe a data set from SourceForge.net, the world’s largest open source software development site, which is available for research purposes. We discuss various data mining techniques that can be applied to the data and the types of research questions that can be answered. We then apply a few of these techniques and provide an analysis of the results.
Scott Christley, Gregory R. Madey