Approximate join is an important part of many data cleaning and integration methodologies. Various similarity measures have been proposed for accurate and efficient matching of st...
Chinese abbreviations are widely used in modern Chinese texts. Compared with English abbreviations (which are mostly acronyms and truncations), the formation of Chinese abbreviati...
Unsupervised learning of linguistic structure is a difficult problem. A common approach is to define a generative model and maximize the probability of the hidden structure give...
A large volume of data with complex structures is currently represented in GML (Geography Markup Language) for storing and exchanging geographic information. As the size ...
We propose "Web Engineering 2.0", which shifts the focus from how to engineer for the Web to how to engineer the Web itself. Web Engineering has become one of the core disciplines...