Abstract. In this paper we present a system, DoLSuD, for the automatic discovery of relevant substructures in a document layout. DoLSuD, Document Layout Substructure Discovery, ext...
Complex documents stored in a flat or partially marked-up file format require layout-sensitive preprocessing before any natural language processing can be carried out on their tex...
SA_MetaMatch, a component of the Standards Advisor (SA), is designed to find relevant documents through matching indices of metadata and document content. The elements in the meta...
We propose a new text mining system which extracts characteristic contents from given documents. We define Key semantics as characteristic sub-structures of syntactic dependencie...