Obtaining high-quality and up-to-date labeled data can be difficult in many real-world machine learning applications, especially for Internet classification tasks like review spam...
This paper describes our participation in the 2008 TREC Blog track. Our system consists of 3 components: data preprocessing, topic retrieval, and opinion finding. In the topic ret...
Large volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...
As a platform-independent solution, XML is going to be used in many environments such as application integration and Web Services. Security of XML instance is a basic problem, esp...
One of the outcomes of Moore's Law - according to which the exponential growth of technical advances in computer science is pushing more and more of our computers into obsole...
Thimoty Barbieri, Franca Garzotto, Giovanni Beltra...