Leximancer is a software system for performing conceptual analysis of text data in a largely language-independent manner. The system is modelled on Content Analysis and provides unsupervised and supervised analysis using seeded concept classifiers. Unsupervised ontology discovery is a key component.

1 Method

The strategy used for conceptual mapping of text involves abstracting families of words to thesaurus concepts. These concepts are then used to classify text at a resolution of several sentences. The resulting concept tags are indexed to provide a document exploration environment for the user. A smaller number of simple concepts can index many more complex relationships by recording co-occurrences, and complex systems approaches can be applied to these systems of agents.

To achieve this, several novel algorithms were developed: a learning optimiser for automatically selecting, learning, and adapting a concept from the word usage within the text, and an asymmetric scaling process f...
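
The classification and indexing steps described above can be illustrated with a small sketch. This is not Leximancer's implementation. It only assumes that a concept can be represented as a set of weighted words, that text is tagged in blocks of a few sentences, and that relationships are recorded as co-occurrence counts of concept tags. The thesaurus contents, the weights, the threshold, and the function names `split_blocks`, `tag_block`, and `cooccurrences` are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical thesaurus: each concept maps member words to evidence weights.
# In the system described, such weights would be learned from the text;
# here they are fixed by hand purely for illustration.
THESAURUS = {
    "dog": {"dog": 2.0, "puppy": 1.5, "bark": 1.0},
    "cat": {"cat": 2.0, "kitten": 1.5, "purr": 1.0},
}


def split_blocks(text, sentences_per_block=3):
    """Split text into blocks of a few sentences each (naive punctuation split)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_block])
        for i in range(0, len(sentences), sentences_per_block)
    ]


def tag_block(block, thesaurus=THESAURUS, threshold=2.0):
    """Return the concepts whose summed word evidence in the block meets the threshold."""
    words = re.findall(r"[a-z']+", block.lower())
    tags = set()
    for concept, weights in thesaurus.items():
        score = sum(weights.get(word, 0.0) for word in words)
        if score >= threshold:
            tags.add(concept)
    return tags


def cooccurrences(tagged_blocks):
    """Count how often each pair of concepts is tagged in the same block."""
    counts = Counter()
    for tags in tagged_blocks:
        for pair in combinations(sorted(tags), 2):
            counts[pair] += 1
    return counts
```

Calling `split_blocks` on a document, applying `tag_block` to each block, and passing the resulting tag sets to `cooccurrences` yields a pair-count table. Such a table is one simple way in which a small set of concepts can index a larger set of relationships.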
Andrew E. Smith