Abstract—Open source software teams routinely develop complex software products in frequent-release settings with rather lightweight processes and project documentation. In this context, a major challenge for project data collection is how to extract the relevant project management knowledge effectively and efficiently from a wide range of software project data sources, such as artifact versions, bug reports, and discussion forums. In this paper, we introduce a framework and tool support for the semantic integration of data from a variety of data sources to facilitate efficient data collection, even in projects with frequent iterations. Based on data from real-world use cases in open source projects, we compare the efficiency of the proposed framework with a traditional data warehouse approach. The major result is that the proposed approach can make data collection for project monitoring about 30% to 50% more efficient, in particular in contexts where heterogeneous data sources change durin...