It has long been noted that many data mining algorithms can be built on top of join algorithms. This has lead to a wealth of recent work on efficiently supporting such joins with ...
Lexiang Ye, Xiaoyue Wang, Dragomir Yankov, Eamonn ...
Supervised learning on sequence data, also known as sequence classification, has been well recognized as an important data mining task with many significant applications. Since te...
Zhengzheng Xing, Jian Pei, Guozhu Dong, Philip S. ...
This paper presents a stagewise least square (SLS) loss function for classification. It uses a least square form within each stage to approximate a bounded monotonic nonconvex los...
We define and solve the problem of "distribution classification", and, in general, "distribution mining". Given n distributions (i.e., clouds) of multi-dimensi...
Yasushi Sakurai, Rosalynn Chong, Lei Li, Christos ...
Multiple view data, which have multiple representations from different feature spaces or graph spaces, arise in various data mining applications such as information retrieval, bio...
Graph data such as chemical compounds and XML documents are getting more common in many application domains. A main difficulty of graph data processing lies in the intrinsic high ...
In this paper, we propose an efficient and effective method to find arbitrarily oriented subspace clusters by mapping the data space to a parameter space defining the set of possi...
Covariate shift is a situation in supervised learning where training and test inputs follow different distributions even though the functional relation remains unchanged. A common...
Yuta Tsuboi, Hisashi Kashima, Shohei Hido, Steffen...
Web spam, which refers to any deliberate actions bringing to selected web pages an unjustifiable favorable relevance or importance, is one of the major obstacles for high quality ...