Address standardization with latent semantic association



Address standardization is a very challenging task in data cleansing. To provide better customer relationship management and business intelligence for customer-oriented corporations, millions of free-text addresses need to be converted to a standard format for data integration, de-duplication and householding. Existing commercial tools usually employ many hand-crafted, domain-specific rules and reference data dictionaries of cities, states, etc. These rules work well for the region they are designed for. However, rule-based methods usually require considerable human effort to rewrite the rules for each new domain, since address data are very irregular and vary across countries and regions. Supervised learning methods are usually more adaptable than rule-based approaches. However, supervised methods need large-scale labeled training data, and building a large-scale annotated corpus for each target domain is labor-intensive and time-consuming. For minimizing human efforts and the size of labe...

Honglei Guo, Huijia Zhu, Zhili Guo, Xiaoxun Zhang, Zhong Su


Address Standardization Model | Data Mining | Free-text Address Standardization | KDD 2009 | LaSA Model |


» Web usage mining based on probabilistic latent semantic analysis

» Unsupervised Image Layout Extraction

» Color patterns for pictorial content description

» Text segmentation via topic modeling: an analytical study

» Holistic Sentiment Analysis Across Languages Multilingual Supervised Latent Dirichlet Allo...

» A probabilistic topic-connection model for automatic image annotation

» Semantic Modeling of Digital Multimedia

» Arithmetical Complexity of First-order Predicate Fuzzy Logics Over Distinguished Semantics

Post Info

Added	25 Nov 2009
Updated	25 Nov 2009
Type	Conference
Year	2009
Where	KDD
Authors	Honglei Guo, Huijia Zhu, Zhili Guo, Xiaoxun Zhang, Zhong Su


Sciweavers
