Today, valuable business information is increasingly stored as unstructured data (documents, emails, etc.). For example, documents exchanged between business partners capture info...
Constrained pattern mining extracts patterns based on their individual merit. Usually this results in far more patterns than a human expert or a machine learning technique could m...
In this paper, we address a relatively new and interesting text categorization problem: classify a political blog as either liberal or conservative, based on its political leaning...
Many websites have large collections of pages generated dynamically from an underlying structured source like a database. The data of a category are typically encoded into similar...
Background: Growing interest in the application of natural language processing methods to biomedical text has led to an increasing number of corpora and methods targeting protein-...
Sampo Pyysalo, Antti Airola, Juho Heimonen, Jari B...