Abstract. In this paper we present a coarse-grained parallel algorithm, CONQUEST, for constructing boundederror summaries of high-dimensional binary attributed data in a distribute...
Recently a number of modeling techniques have been developed for data mining and machine learning in relational and network domains where the instances are not independent and ide...
Jennifer Neville, Brian Gallagher, Tina Eliassi-Ra...
—Gene expression data usually contain a large number of genes, but a small number of samples. Feature selection for gene expression data aims at finding a set of genes that best...
Shenghuo Zhu, Dingding Wang, Kai Yu, Tao Li, Yihon...
We define the concept of dependence among multiple variables using maximum entropy techniques and introduce a graphical notation to denote the dependencies. Direct inference of in...
Statistical machine learning techniques for data classification usually assume that all entities are i.i.d. (independent and identically distributed). However, real-world entities...