This paper describes a theoretical approach to data mining and information classification, together with a global overview of our OntoExtractor application, which analyzes an incoming data flow and generates metadata structures. To help the user classify a large and varied collection of data, we propose using fuzzy-based techniques to compare and classify the data. Before the elements can be compared, the incoming flow of information has to be converted into a common structured format such as XML. Once the documents are structured, we can compare and cluster them and generate a metadata structure describing the data repository.
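
As a purely illustrative sketch of the pipeline outlined above (structured XML input, fuzzy comparison, clustering), the following Python fragment may help fix ideas. It is not the OntoExtractor implementation. The function names, the depth-based membership degree (1/depth), the fuzzy Jaccard similarity, and the greedy threshold clustering are all assumptions chosen for brevity.

```python
import xml.etree.ElementTree as ET


def tag_memberships(xml_text):
    """Represent an XML document as a fuzzy set of tag names.

    Assumption (not from the paper): a tag's membership degree is
    1 / depth, so elements near the root weigh more. If a tag occurs
    several times, the highest degree is kept.
    """
    root = ET.fromstring(xml_text)
    memberships = {}
    stack = [(root, 1)]
    while stack:
        node, depth = stack.pop()
        degree = 1.0 / depth
        memberships[node.tag] = max(memberships.get(node.tag, 0.0), degree)
        for child in node:
            stack.append((child, depth + 1))
    return memberships


def fuzzy_jaccard(a, b):
    """Fuzzy Jaccard similarity: sum of minima over sum of maxima."""
    tags = set(a) | set(b)
    inter = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in tags)
    union = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in tags)
    return inter / union if union else 1.0


def cluster(docs, threshold=0.6):
    """Greedy single-pass clustering (hypothetical, for illustration).

    Each document joins the first cluster whose first member is at
    least `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters = []  # list of (representative fuzzy set, [document names])
    for name, xml_text in docs.items():
        fuzzy_set = tag_memberships(xml_text)
        for representative, members in clusters:
            if fuzzy_jaccard(representative, fuzzy_set) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((fuzzy_set, [name]))
    return [members for _, members in clusters]
```

Under these assumptions, two documents that share most of their tag structure end up in the same cluster, while a structurally unrelated document forms its own. The resulting groups could then serve as the starting point for a metadata structure describing the repository.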