This paper studies a text mining problem, comparative sentence mining. A comparative sentence expresses an ordering relation between two sets of entities with respect to some comm...
A novel technique for maximum "a posteriori" (MAP) adaptation of maximum entropy (MaxEnt) and maximum entropy Markov models (MEMM) is presented. The technique is applied...
A thesaurus and an ontology provide a set of structured terms, phrases, and metadata, often in a hierarchical arrangement, that may be used to index, search, and mine documents. W...
CzEng 0.9 is the third release of a large parallel corpus of Czech and English. For the current release, CzEng was extended by significant amount of texts from various types of so...
Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...