Abstract. Searching and mining nuclear magnetic resonance (NMR) spectra of naturally occurring substances is an important task to investigate new potentially useful chemical compoun...
Alexander Hinneburg, Andrea Porzel, Karina Wolfram
Deduplication is a key operation in integrating data from multiple sources. The main challenge in this task is designing a function that can resolve when a pair of records refer t...
Recent advances in linear classification have shown that for applications such as document classification, the training can be extremely efficient. However, most of the existing t...
Detecting local clustered anomalies is an intricate problem for many existing anomaly detection methods. Distance-based and density-based methods are inherently restricted by their...
When data sets are analyzed, statistical pattern recognition is often used to find the information hidden in the data. Another approach to information discovery is data mining. Dat...