We develop data structures for dynamic closest pair problems with arbitrary (not necessarily geometric) distance functions, based on a technique previously used by the author for ...
Collections are a fundamental tool for reproducible evaluation of information retrieval techniques. We describe a new method for distributing the document lengths and term counts ...
Program authorship attribution—identifying a programmer based on stylistic characteristics of code—has practical implications for detecting software theft, digital forensics, a...
Nathan E. Rosenblum, Xiaojin Zhu, Barton P. Miller
Background: Designing appropriate machine learning methods for identifying genes that have a significant discriminating power for disease outcomes has become more and more importa...
We present Darwin, an enabling technology for mobile phone sensing that combines collaborative sensing and classification techniques to reason about human behavior and context on ...