
WISE
2005
Springer


Identifying Value Mappings for Data Integration: An Unsupervised Approach



The Web is a distributed network of information sources where the individual sources are autonomously created and maintained. Consequently, syntactic and semantic heterogeneity of data among sources abounds. Most current data cleaning solutions assume that data values referencing the same object bear some textual similarity. However, this assumption is often violated in practice. “Two-door front wheel drive” can be represented as “2DR-FWD” or “R2FD”, or even as “CAR TYPE 3” in different data sources. To address this problem, we propose a novel two-step automated technique that exploits statistical dependency structures among objects, structures that are invariant to the tokens representing the objects. The algorithm achieved high accuracy in our empirical study, suggesting that it can be a useful addition to existing information integration techniques.
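The abstract does not spell out the two steps, so the following is only a minimal illustrative sketch of the underlying idea, not the authors' algorithm. It assumes each source is a list of two-column rows, profiles every value by statistics that do not depend on its spelling (its relative frequency and its sorted conditional co-occurrence distribution), and then picks the one-to-one mapping with the smallest total profile distance. The function names, the distance measure, the sample rows, and the token "CAR TYPE 7" are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def signature(rows, col=0, other=1):
    """Profile each value in column `col` without looking at its spelling:
    (relative frequency, sorted conditional distribution over column `other`)."""
    n = len(rows)
    totals = Counter(r[col] for r in rows)
    pairs = Counter((r[col], r[other]) for r in rows)
    sig = {}
    for value, count in totals.items():
        cond = sorted(
            (pairs[(v, w)] / count for (v, w) in pairs if v == value),
            reverse=True,
        )
        sig[value] = (count / n, cond)
    return sig


def distance(a, b):
    """L1 distance between two profiles, padding the shorter distribution with zeros."""
    freq_a, cond_a = a
    freq_b, cond_b = b
    m = max(len(cond_a), len(cond_b))
    cond_a = cond_a + [0.0] * (m - len(cond_a))
    cond_b = cond_b + [0.0] * (m - len(cond_b))
    return abs(freq_a - freq_b) + sum(abs(x - y) for x, y in zip(cond_a, cond_b))


def match_values(rows_a, rows_b):
    """Brute-force the one-to-one mapping with minimal total distance.
    Assumes source B has at least as many distinct values as source A,
    and only a handful of them (the search is factorial)."""
    sig_a, sig_b = signature(rows_a), signature(rows_b)
    values_a, values_b = sorted(sig_a), sorted(sig_b)
    best = min(
        permutations(values_b, len(values_a)),
        key=lambda p: sum(distance(sig_a[x], sig_b[y]) for x, y in zip(values_a, p)),
    )
    return dict(zip(values_a, best))


# Hypothetical sample data: (body type, colour) in two sources with unrelated tokens.
rows_a = (
    [("2DR-FWD", "red")] * 6 + [("2DR-FWD", "blue")] * 2
    + [("4DR-AWD", "red")] + [("4DR-AWD", "blue")]
)
rows_b = (
    [("CAR TYPE 3", "c1")] * 7 + [("CAR TYPE 3", "c2")] * 3
    + [("CAR TYPE 7", "c1")] + [("CAR TYPE 7", "c2")] * 2
)

mapping = match_values(rows_a, rows_b)
print(mapping)
```

Because the profiles are built only from counts, renaming every token in either source leaves the distances unchanged. A realistic system would need richer dependency models and a scalable matching step in place of the brute-force search used here.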

Jaewoo Kang, Dongwon Lee, Prasenjit Mitra



Post Info

Added	25 Jun 2010
Updated	25 Jun 2010
Type	Conference
Year	2005
Where	WISE
Authors	Jaewoo Kang, Dongwon Lee, Prasenjit Mitra
