Rapid growth of digital data collections is overwhelming the capabilities of humans to comprehend them without aid. The extraction of useful data from large raw data sets is someth...
We derive PAC-Bayesian generalization bounds for supervised and unsupervised learning models based on clustering, such as co-clustering, matrix tri-factorization, graphical models...
Microarray technology offers a high-throughput means to study expression networks and gene regulatory networks in cells. The intrinsic nature of high dimensionality and small s...
Yijuan Lu, Qi Tian, Maribel Sanchez, Jennifer L. N...
Concern about national security has increased significantly since the 9/11 attacks. However, information overload hinders the effective analysis of criminal and terrorist activ...
Hsinchun Chen, Wingyan Chung, Yi Qin, Michael Chau...
Educational data mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from the educational context. This work is a ...