Systematic reviews are only as good as the evidence they are based on. It is important, therefore, that users of systematic reviews know how much confidence they can place in the conclusions and recommendations arising from such reviews. In this paper we present an overview of some of the most influential systems for assessing the quality of individual primary studies and for grading the overall strength of a body of evidence. We also present an example of the use of such systems based on a systematic review of empirical studies of agile software development. Our findings suggest that the systems used in other disciplines for grading the strength of evidence for and reporting of systematic reviews, especially those that take account of qualitative and observational studies are of particular relevance for software engineering. Categories and Subject Descriptors D.3 [Software Engineering] General Terms Management, Measurement, Experimentation, Theory. Keywords Systematic Review, Quality...