Essentially all data mining algorithms assume that the data-generating process is independent of the data miner's activities. However, in many domains, including spam detectio...
Nilesh N. Dalvi, Pedro Domingos, Mausam, Sumit K. ...
IP packet streams consist of multiple interleaving IP flows. Statistical summaries of these streams, collected for different measurement periods, are used for characterization of ...
Edith Cohen, Nick G. Duffield, Haim Kaplan, Carste...
In this paper, we identify and analyze structural properties which reflect the functionality of a Web site. These structural properties consider the size, the organization, the co...
Public data sharing is utilized in a number of businesses to facilitate the exchange of information. Privacy constraints are usually enforced to prevent unwanted inference of info...
We present the GeoStar project at RPI, which researches various terrain (i.e., elevation) representations and operations thereon. This work is motivated by the large amounts of hi...
W. Randolph Franklin, Metin Inanc, Zhongyi Xie, Da...