Most data integration applications require a matching between the schemas of the respective data sets. We show how the existence of duplicates within these data sets can be exploi...
We describe an ensemble approach to learning salient regions from data partitioned according to the distributed processing requirements of large-scale simulations. The volume...
Larry Shoemaker, Robert E. Banfield, Larry O. Hall...
We examine the social behaviors of game experts in Everquest II, a popular massive multiplayer online role-playing game (MMO). We rely on Exponential Random Graph Models (ERGM)...
David Huffaker, Jing (Annie) Wang, Jeffrey William...
While web pages sent over HTTP have no integrity guarantees, it is commonly assumed that such pages are not modified in transit. In this paper, we provide evidence of surprisingly...
Charles Reis, Steven D. Gribble, Tadayoshi Kohno, ...
Estimating insurance premia from data is a difficult regression problem for several reasons: the large number of variables, many of which are discrete, and the very peculiar shape...
Nicolas Chapados, Yoshua Bengio, Pascal Vincent, J...