Researchers in the data mining area frequently have to spend significant portion of their time on preprocessing the data in order to apply their algorithms to real-world datasets...
Zhaoqi Chen, Dmitri V. Kalashnikov, Sharad Mehrotr...
A lift curve, with the true positive rate on the y-axis and the customer pull (or contact) rate on the x-axis, is often used to depict the model performance in many data mining ap...
In many application domains (e.g., WWW mining, molecular biology), large string datasets are available and yet under-exploited. The inductive database framework assumes that both s...
Many social Web sites allow users to annotate the content with descriptive metadata, such as tags, and more recently to organize content hierarchically. These types of structured ...
Anon Plangprasopchok, Kristina Lerman, Lise Getoor
In plenty of scenarios, data can be represented as vectors mathematically abstracted as points in a Euclidean space. Because a great number of machine learning and data mining app...