Empirical testing is a widely used evaluation method in the development of intelligent systems. Previously solved problems with known correct solutions are given to the system as cases, and validity is tested by comparing the solutions the system derives with the expected results. Beyond classic boolean testing of the solutions that occur, a thorough evaluation of real-world knowledge systems requires more refined methods. We present extended precision and recall functions for interactive knowledge systems that generalize the existing measures. Additionally, we propose a visualization method for inspecting the validation results of interactive systems. A case study with a second-opinion system from the medical domain demonstrates the usefulness of the approach.