There has usually been a clean separation between networks and the applications that use them. Applications send packets over a simple socket API; the network delivers them. Howev...
Kok-Kiong Yap, Te-Yuan Huang, Ben Dodson, Monica S...
Improving data quality is a time-consuming, labor-intensive and often domain specific operation. Existing data repair approaches are either fully automated or not efficient in int...
Mohamed Yakout, Ahmed K. Elmagarmid, Jennifer Nevi...
Abstract Irregularities are widespread in large databases and often lead to erroneous conclusions with respect to data mining and statistical analysis. For example, considerable bi...
Siu-Tong Au, Rong Duan, Siamak G. Hesar, Wei Jiang
Today's record matching infrastructure does not allow a flexible way to account for synonyms such as "Robert" and "Bob" which refer to the same name, and ...
Analyzing the quality of data prior to constructing data mining models is emerging as an important issue. Algorithms for identifying noise in a given data set can provide a good me...
Jason Van Hulse, Taghi M. Khoshgoftaar, Haiying Hu...