Across a wide variety of fields, huge datasets are being collected and accumulated at a dramatic pace. The datasets addressed by individual applications are very often heterogeneous and geographically distributed, and are used collaboratively by user communities that are often large and also geographically distributed. Major challenges arise in storing this great mass of data efficiently and reliably, processing it quickly, and extracting descriptive and predictive knowledge from it. In this paper, we describe the design principles and service-based software architecture of a novel infrastructure for distributed, high-performance data mining in Computational Grid environments. This architecture is designed, and is being implemented, on top of the Globus 3.0 Alpha toolkit, which provides basic Grid services such as authentication, information, and resource management, and the OGSA-DAI Grid Services, which provide basic access to Grid databases.