An important problem in the area of homeland security is to identify abnormal or suspicious entities in large datasets. Although there are methods from data mining and social netwo...
Statistical machine learning techniques for data classification usually assume that all entities are i.i.d. (independent and identically distributed). However, real-world entities...
Visitors enter a website through a variety of means, including web searches, links from other sites, and personal bookmarks. In some cases the first page loaded satisfies the visi...
Justin Brickell, Inderjit S. Dhillon, Dharmendra S...
This paper presents a new algorithm for sequence prediction over long categorical event streams. The input to the algorithm is a set of target event types whose occurrences we wis...
s In data mining, we emphasize the need for learning from huge, incomplete and imperfect data sets (Fayyad et al. 1996, Frawley et al. 1991, Piatetsky-Shapiro and Frawley, 1991). T...