Abstract. Text documents have sparse data spaces, and nearest neighbors may belong to different classes when existing proximity measures are used to describe the correlation ...
This paper presents a complete OCR methodology for recognizing historical documents, either printed or handwritten, without any knowledge of the font. This methodology cons...
This paper reports on the INRIA group’s approach to XML mining while participating in the INEX XML Mining track 2005. We use a flexible representation of XML documents that allo...
Anne-Marie Vercoustre, Mounir Fegas, Saba Gul, Yve...
In distributed data mining models, adopting a flat node distribution model can affect scalability. To address the problem of modularity, flexibility and scalability, we propose...
This paper presents a methodology for learning taxonomic relations from a set of documents that each explain one of the concepts. Three different feature extraction approaches with...