We present a highly accurate method for classifying web pages based on link percentage, which is the percentage of text characters that are parts of links normalized by the number...
The explosion of user-generated content on the Web has led to new opportunities and significant challenges for companies, that are increasingly concerned about monitoring the disc...
This paper presents a real-world application for assisting medical diagnosis and drug prescription, which relies on the exclusive use of machine learning techniques. We have autom...
Determining whether two terms in text have an ancestor relation (e.g. Toyota and car) or a sibling relation (e.g. Toyota and Honda) is an essential component of textual inference ...
Previous work on spatio-temporal analysis of news items and other documents has largely focused on broad categorization of small text collections by region or country. A system fo...