We study a novel shallow information extraction problem that involves extracting sentences of a given set of topic categories from medical forum data. Given a corpus of medical fo...
Random forests are one of the best performing methods for constructing ensembles. They derive their strength from two aspects: using random subsamples of the training data (as in b...
Zero-days attacks are one of the most dangerous threats against computer networks. These, by definition, are attacks never seen before. Thus, defense tools based on a database of ...
Number and date expressions are essential information items in corpora and therefore play a major role in various text mining applications. However, so far number expressions were ...
Abstract. We derive upper and lower bounds for some statistical estimation problems. The upper bounds are established for the Gibbs algorithm. The lower bounds, applicable for all ...