Automated extraction of structured data from Web sources often leads to large heterogeneous knowledge bases (KBs), with data and schema items numbering in the hundreds of thousands...
We survey the emerging area of compression-based, parameter-free similarity distance measures useful in data mining, pattern recognition, learning, and automatic semantics extracti...
Through the Internet and the World-Wide Web, a vast number of information sources have become available, offering information on various subjects from different providers, often i...
Mathematical software libraries provide many computational services. The properties of mathematical operators can be used to combine several services in order to provide more complex one...
In many Web applications, such as blog classification and newsgroup classification, labeled data are in short supply. It often happens that obtaining labeled data in a new domain ...