In the paper we investigate the impact of data size on a Word Sense Disambiguation task (WSD). We question the assumption that the knowledge acquisition bottleneck, which is known...
Numerous genomic annotations are currently stored in different web-accessible databanks that scientists need to mine with user-defined queries and in a batch mode to orderly integ...
Marco Masseroli, Andrea Stella, Natalia Meani, Myr...
We address the problem of part-of-speech tagging for English data from the popular microblogging service Twitter. We develop a tagset, annotate data, develop features, and report ...
Kevin Gimpel, Nathan Schneider, Brendan O'Connor, ...
Active Learning (AL) is a selective sampling strategy which has been shown to be particularly cost-efficient by drastically reducing the amount of training data to be manually ann...
Managing large-scale time series databases has attracted significant attention in the database community recently. Related fundamental problems such as dimensionality reduction, tr...