Deduplication is a key operation in integrating data from multiple sources. The main challenge in this task is designing a function that can resolve when a pair of records refer t...
Deduplication, a key operation in integrating data from multiple sources, is a time-consuming, labor-intensive and domain-specific operation. We present our design of ALIAS that us...
Duplicate elimination is an important stage in integrating data from multiple sources. The challenges involved are finding a robust deduplication function that can identify when t...
The great majority of genetic programming (GP) algorithms that deal with the classification problem follow a supervised approach, i.e., they consider that all fitness cases availab...
Junio de Freitas, Gisele L. Pappa, Altigran Soares...
Data mining techniques have become central to many applications. Most of those applications rely on so-called supervised learning algorithms, which learn from given examples in th...