Sciweavers

CAISE
2010
Springer

Probabilistic Models to Reconcile Complex Data from Inaccurate Data Sources

A large amount of data is published on the Web, and several techniques have been developed to extract and integrate data from Web sources. However, Web data are inherently imprecise and uncertain, and although novel approaches for dealing with uncertain data have recently been proposed, they assume that the data come with an associated degree of uncertainty. This paper addresses the issue of characterizing the uncertainty of data extracted from a number of inaccurate sources. We developed a probabilistic model that computes a probability distribution over the extracted values, together with the accuracy of the sources. Our model accounts for sources that copy their contents from other sources, and can deal with the misleading consensus produced by copiers. It extends models previously proposed in the literature by working on several attributes at a time to better leverage all the available evidence. We also report the results of several experiments on both synth...
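To make the idea of jointly estimating value probabilities and source accuracy concrete, here is a minimal Python sketch of a generic accuracy-weighted voting loop. It is not the model from the paper: it ignores copiers and treats every attribute independently. The function name `reconcile`, the parameters `n_false` (assumed number of wrong values per item), `init_accuracy`, and `iterations`, and the toy sources are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def reconcile(claims, n_false=10, init_accuracy=0.8, iterations=20):
    """Estimate value probabilities and source accuracies.

    claims: {source: {item: value}}
    Returns (value_probs, accuracy), where value_probs[item][value] is a
    probability normalized over the values actually observed for that item.
    """
    accuracy = {source: init_accuracy for source in claims}
    value_probs = {}
    for _ in range(iterations):
        # Step 1: each source votes for its value with a log-odds weight
        # that grows with the source's current accuracy estimate.
        votes = defaultdict(lambda: defaultdict(float))
        for source, items in claims.items():
            a = min(max(accuracy[source], 0.01), 0.99)  # keep the log finite
            weight = math.log(n_false * a / (1.0 - a))
            for item, value in items.items():
                votes[item][value] += weight

        # Turn the vote totals into a probability distribution per item.
        value_probs = {}
        for item, scores in votes.items():
            top = max(scores.values())  # subtract the max for stability
            exps = {v: math.exp(c - top) for v, c in scores.items()}
            total = sum(exps.values())
            value_probs[item] = {v: e / total for v, e in exps.items()}

        # Step 2: a source's accuracy is the mean probability of its claims.
        for source, items in claims.items():
            if items:
                accuracy[source] = sum(
                    value_probs[item][value] for item, value in items.items()
                ) / len(items)
    return value_probs, accuracy


# Hypothetical toy input: three sources reporting three attributes.
example_claims = {
    "site_a": {"price": 10, "volume": 200, "open": 9},
    "site_b": {"price": 10, "volume": 200, "open": 9},
    "site_c": {"price": 12, "volume": 200, "open": 8},
}
example_probs, example_accuracy = reconcile(example_claims)
```

Because this sketch assumes independent sources, two sites that copy from each other would count as two separate votes; that is exactly the misleading consensus the abstract says the paper's model is designed to handle.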
Lorenzo Blanco, Valter Crescenzi, Paolo Merialdo, Paolo Papotti
Added 08 Nov 2010
Updated 08 Nov 2010
Type Conference
Year 2010
Where CAISE
Authors Lorenzo Blanco, Valter Crescenzi, Paolo Merialdo, Paolo Papotti