Author identification models fall into two major categories according to the way they handle the training texts: profile-based models produce one representation per author while in...
The classification of encrypted traffic on the fly from network traces represents a particularly challenging application domain. Recent advances in machine learning provide the opp...
Biomedical literature is an important source of information for chemical compounds. However, different representations and nomenclatures for chemical entities exist, which makes th...
Tiago Grego, Piotr Pezik, Francisco M. Couto, Diet...
Wikipedia vandalism identification is a very complex issue, which is now mostly solved manually by volunteers. This paper presents the main components of a system built by our grou...
It is important yet hard to identify navigational queries in Web search due to a lack of sufficient information in Web queries, which are typically very short. In this paper we st...