Grid Computing brought the promise of making high-performance computing cheaper and more readily available than traditional supercomputing platforms. This promise was well received by the data mining (DM) community, as DM applications typically process very large datasets and are thus very resource intensive. However, because the Grid is highly dynamic and parallel data mining is prone to load imbalance, obtaining good data mining performance on the Grid is hard. It typically requires the scheduler to understand the inner workings of the application, which raises two related problems. First, good Grid schedulers tend to be highly specialized for the application they target. Second, changing the application may require changing the scheduler, which may be especially challenging when there is no clear separation between the application and the scheduler code. We propose and evaluate a knowledge-based approach that provides abstractions to the DM developer and optimizes at runtime the DM applic...